Remove failed PSC install from SSO domain

I help administer an environment that is currently running vSphere 6.0.  The SSO domain was created with an external Platform Services Controller (PSC) and a vCenter 6.0 server, both running on Windows Server 2012 R2.  At some point, a SQL database crash forced us to restore both the PSC and the vCenter server.  The specifics are hazy (or maybe I’ve just tried to bury that memory for good), but eventually things got back up and running well enough.  There are still some minor quirks here and there, so one thing I’ve had in the back of my mind is to deploy a new PSC appliance, join it to the existing SSO domain, and repoint my vCenter to the new PSC.  Then I would decommission the existing Windows PSC server and eventually migrate to a vCenter appliance as well.

I began that process thinking it would be easy enough: I could at least start by deploying an external PSC appliance and joining it to the existing domain.  I made sure my new PSC name had an A record manually registered in DNS and went through the typical vCenter appliance install.  Eventually, the install errored out with “failed to run vdcpromo”:

[Screenshot: installer error, failed to run vdcpromo]
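For anyone who would rather script the DNS prerequisite than eyeball it, here is a minimal Python sketch that checks whether a name resolves to the address you expect before you start the install.  The function names are my own, and the hostname and IP in the usage note are placeholders, not values from my environment.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses a name resolves to, or [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(
            hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        # The resolver has no answer for this name (e.g. the A record is missing).
        return []
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def forward_lookup_ok(hostname, expected_ip):
    """True if expected_ip is among the addresses the resolver returns for hostname."""
    return expected_ip in resolve_ipv4(hostname)
```

Calling something like `forward_lookup_ok("psc02.example.com", "192.0.2.10")` from the machine running the installer should return True once the A record is in place.  This only checks the forward lookup; it says nothing about reverse (PTR) records.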

I noticed that even though the install failed, the vSphere Web Client showed a new node listed in the configuration:

[Screenshot: vSphere Web Client listing the new node]

Some searching brought me to VMware KB 2117378, so I logged into the Windows PSC server and opened an Administrator command prompt.  Per the article, I drilled down to the vmdird (VMware Directory Service) folder and ran the vdcleavefed command.  I was then presented with my next error:

[Screenshot: error returned by vdcleavefed]

It seems that because the PSC install never got to the point where the new server actually “joined” the SSO domain, there was nothing for vdcleavefed to remove from the federation.  Basically, I was stuck with a stale record that had been partially registered but never fully federated into the SSO domain.

I went ahead and downloaded JXplorer, a nifty open source LDAP browser that is free to download.  Using JXplorer, I was able to connect via LDAP to the SSO domain on my existing PSC server:

[Screenshot: JXplorer connected to the SSO domain on the existing PSC]

BEWARE – Just like Active Directory, changing any of these entries without knowing exactly what you are doing can cause serious harm!  In this case I knew that my PSC install went south and that the failed PSC VM could be removed without harming the rest of my SSO environment.

Within JXplorer, you can see the SSO domain info, similar to what you would see in MS Active Directory.  vCenter servers are listed under the “Computers” group and PSC servers are listed under “Domain Controllers.”  In this instance, I deleted the entry for the failed PSC instance and disconnected JXplorer.

[Screenshot: SSO domain tree in JXplorer]
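If it helps to know what to look for in the tree, here is a small Python sketch that builds the distinguished name I would expect for a PSC entry.  It rests on two assumptions that are not confirmed above: that the SSO domain is the default vsphere.local, and that “Domain Controllers” is an organizational unit sitting directly under the domain base.  The helper names and the example hostname are made up.  It only builds strings for comparison; it does not connect to or change anything.

```python
def base_dn(sso_domain):
    """Turn an SSO domain name such as 'vsphere.local' into 'dc=vsphere,dc=local'."""
    labels = [part for part in sso_domain.strip().split(".") if part]
    if not labels:
        raise ValueError("SSO domain name is empty")
    return ",".join(f"dc={label}" for label in labels)


def psc_entry_dn(psc_fqdn, sso_domain="vsphere.local"):
    """Expected DN of a PSC entry, ASSUMING it lives under ou=Domain Controllers."""
    return f"cn={psc_fqdn},ou=Domain Controllers,{base_dn(sso_domain)}"
```

For a hypothetical failed node named psc02.example.com, `psc_entry_dn("psc02.example.com")` gives `cn=psc02.example.com,ou=Domain Controllers,dc=vsphere,dc=local`.  Compare that against what JXplorer actually shows before deleting anything.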

After the deletion and a quick refresh of my web client, I was back to seeing the two production nodes that I expected to see!  After a bang-my-head-on-the-desk moment of clarity, I realized that the original vdcpromo error was due to attempting to deploy a fresh install of 6.0 U3 when the production version was only 6.0 U2.  My intent was to upgrade to U3, but rather than deploy the U3 appliance, I will first have to upgrade my 6.0 U2 Windows PSC to U3 and then move forward with the rest of the plan.  The above steps at least got me back to a clean slate so that I can proceed accordingly.
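In hindsight, a trivial pre-flight check would have saved me the trouble.  Here is a rough Python sketch of the idea: parse release labels like “6.0 U2” and refuse to go ahead unless the existing PSC and the new installer match.  The label format and function names are my own invention for illustration; this does not query anything from vSphere, so you would type in the versions yourself.

```python
import re


def parse_release(label):
    """Parse a label such as '6.0 U3' into (6, 0, 3); a missing update level counts as 0."""
    match = re.fullmatch(
        r"\s*(\d+)\.(\d+)(?:\s*U(\d+))?\s*", label, flags=re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"unrecognized release label: {label!r}")
    major, minor, update = match.groups()
    return (int(major), int(minor), int(update or 0))


def same_release(existing, new):
    """True only when both labels describe the same major, minor, and update level."""
    return parse_release(existing) == parse_release(new)
```

With my versions, `same_release("6.0 U2", "6.0 U3")` returns False, which is the hint to upgrade the existing PSC first.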
