Site Recovery Manager (SRM) Expired Certificate

So it’s that time of the year again: your Site Recovery Manager (SRM) appliance needs an update, and you’ve found that the certificate has also expired. It seems that SRM will continue to function with the expired cert for as long as the appliance stays up, right until you next reboot or update it. However, once you do update or reboot, you’ll find that SRM is less than enthusiastic about reconnecting to your existing environment. This happened to me about a month ago when we hit a bug that required an update to our appliances. In this instance, the bug was in replacing DNS servers (we were moving to a newfangled DNS setup): it was impossible to remove old DNS servers, and newly added nameservers got appended to the end of the existing server list.

Anyway! We updated our SRM appliance to get around this bug. However, upon logging back into the VAMI, we found a big red banner ordering us to reconfigure the appliance to reconnect to vCenter. This reconfigure would fail with:

Exit code: 61
[backtrace begin] product: VMware vCenter Site Recovery Manager, version: 8.2.1, build: build-17078491, tag: drconfig, cpu: x86_64, os: linux, buildType: release
backtrace[03][0x0018C58B]: Vmacore::Throwable::Throwable(std::string)
backtrace[04] dr-configurator[0x000BA1A8]
backtrace[05] dr-configurator[0x000BF702]
backtrace[06] dr-configurator[0x00069255]
backtrace[07] dr-configurator[0x00069E6B]
backtrace[08] dr-configurator[0x0006AEDF]
backtrace[09] dr-configurator[0x00070717]
backtrace[10] dr-configurator[0x0005F2DB]
backtrace[11] dr-configurator[0x000750DC]
[backtrace end]
Caused by:
faultCause = (vmodl.MethodFault) null,
faultMessage = unset,
invalidProperty = "Invalid certificate"
msg = "Received SOAP response fault from [cs p:00007f66d801b7f0, TCP:vcenter1.amatismvdc.local:443]: create

Now, obviously this is a most suboptimal message to be presented with after what should be a simple reconnect operation. The line invalidProperty = "Invalid certificate" indicates that something is wrong with our cert. Fear not, the answer is simple: head over to the ‘Access’ tab on the left and you’ll find that the certificate probably expired a couple of months ago.

Screenshot showing an expired certificate in the SRM VAMI
You’ll find that the ‘expires on’ date has probably already passed.
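If you’d rather confirm the expiry date without clicking through the VAMI, here is a minimal sketch of how it could be checked from a workstation. It assumes Python 3 and the openssl command-line tool are installed; the function names are my own, and the port is an assumption that you may need to adjust for whichever endpoint presents the appliance cert.

```python
import ssl
import subprocess
import time
from typing import Optional


def fetch_not_after(host: str, port: int = 443) -> str:
    """Return the served certificate's notAfter string, e.g. 'Jun  1 12:00:00 2021 GMT'."""
    # get_server_certificate() does not validate the chain, so it still
    # works against an expired or self-signed certificate.
    pem = ssl.get_server_certificate((host, port))
    # Assumption: the openssl CLI is on the PATH; it prints 'notAfter=<date>'.
    result = subprocess.run(
        ["openssl", "x509", "-noout", "-enddate"],
        input=pem, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip().split("=", 1)[1]


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before not_after; a negative value means already expired."""
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400
```

Calling days_until_expiry(fetch_not_after("srm1.example.local")) (a hypothetical hostname) returns a negative number if the cert has already expired. Pointing the same call at your vCenter is a cheap way to rule out its cert before anyone suggests regenerating it.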

If this is the case, then simply click ‘Change’ and generate a new cert, making sure that the FQDN and IP addresses are correct. The interface should reload, and you’ll almost certainly need to accept the new cert in whichever browser you chose to sell your soul to. I found that this can happen either instantly or within a few minutes, and the UI doesn’t really tell you what’s going on during this time. If you hit any difficulties, according to VMware support, the best course is to perform a hard reboot, which is… an interesting way of fixing it. Either way, once the new cert is in place, the reconfigure operation should go ahead as planned. You may need to reconnect the two sites in SRM itself, which is a simple enough task to perform.

As a side note, VMware support initially thought the issue lay with an expired vCenter cert, which would have meant regenerating a completely new cert in vCenter. Depending on the services connected to your vCenter, this is a massively disruptive thing to do, as all connected solutions (such as Cloud Director, SRM, vROps et al.) need to be manually reconnected, which can take ages. In the end, it was actually a very simple cert replacement that Damian (you’re a legend Damian, with the patience of a saint) at VMware support was kind enough to point out, although we had other issues with the appliance that resulted in this simple job taking upwards of half a day. Needless to say, I think we needed a strong coffee after all that… ☕️


