This guest post is by Bryan Krausen who blogs at ITDiversified, where you can find his back catalogue of posts. Find out more about the guest blogger program here.
VMware’s Site Recovery Manager provides the ability to fail over VMs in the event of a disaster or for planned migrations between datacenters. In my case, we’re using it for the former as we have two “main” datacenters, one located in Louisville and the other in downtown Chicago. Although we’ve tried to deploy applications with site redundancy in mind, some applications are simply reliant on a single site, whether by limitations of the software or by choice.
Decisions, decisions… Planning a VMware SRM deployment takes considerable thought, as there are many choices and routes to accomplish the same tasks. For example, we could choose to utilize EMC’s RecoverPoint or VMware vSphere Replication to replicate the data to the opposite site. Should we create a dedicated “DR” VLAN, or explore spanning Layer 2 across the datacenters using VXLAN or OTV? Should we purchase additional capacity at both sites or simply suspend “Tier 2/3” VMs in the event of a failure to ensure resources are available?
This is the infrastructure we’ve decided upon and will be using for SRM. As you can see, I’m fortunate to have such enterprise level gear for such a small company (~500 employees).
SRM Architecture
After all prerequisites were met and deployed (vCenter, RecoverPoint, etc.), the time to install SRM was here. As I tried to go through the install process, I ended up restarting it probably 20 times due to all the little nuisances and “picky-ness” of the installer. Some of them were understandable, some of them just straight-up annoying. Below are the “gotchas” that I found when installing SRM. Note: This was my first time ever installing SRM.
1.) To start, UAC must be turned off on the SRM servers – In Server 2012, moving the UAC slider down to “Never Notify” isn’t good enough. You must disable it completely by modifying the following registry setting.
“HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\policies\system”
Modify the DWORD “EnableLUA” from 1 to 0.
You’ll get a notification that a reboot is required. Once rebooted, UAC is finally disabled.
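If you would rather script the registry change than click through regedit, here is a minimal sketch. It only assembles a `reg add` command for the key and value named above; the function name is my own, and nothing is executed. Run the printed command from an elevated prompt on the SRM server, then reboot.

```python
from __future__ import annotations

import subprocess

# Registry key from the step above, written with the HKLM abbreviation.
UAC_KEY = r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System"


def build_disable_uac_command(key: str = UAC_KEY) -> list[str]:
    """Return the reg.exe arguments that set the EnableLUA DWORD to 0.

    Hypothetical helper for illustration: it builds the command only and
    never touches the registry itself.
    """
    return [
        "reg", "add", key,
        "/v", "EnableLUA",    # value name
        "/t", "REG_DWORD",    # value type
        "/d", "0",            # 0 = UAC fully disabled
        "/f",                 # overwrite without prompting
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before it is run by hand.
    print(subprocess.list2cmdline(build_disable_uac_command()))
```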
2.) The database schema must have the same name as the service account when using SQL as the backend database. Even if the service account is a db_owner, you’ll otherwise get a “Failed to create database tables” error message.
In my case, the service account I’m using is “svc_srm”, requiring the schema to be named the exact same thing.
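As a rough sketch of what that looks like on the SQL side, the snippet below generates the T-SQL for a schema named after the account. It assumes the database user is named exactly like the service account (a domain account may appear as DOMAIN\svc_srm instead). The second statement, which makes that schema the user's default, is my assumption about what the installer expects and not something I verified. The helper names are hypothetical.

```python
from __future__ import annotations


def quote_ident(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def schema_statements(db_user: str) -> list[str]:
    """Return T-SQL that creates a schema named after db_user, owned by
    that user, and then sets it as the user's default schema.

    Run each statement in its own batch inside the SRM database:
    CREATE SCHEMA has to be the first statement in a batch.
    """
    ident = quote_ident(db_user)
    return [
        f"CREATE SCHEMA {ident} AUTHORIZATION {ident};",
        # Assumption: the default schema should also point at the new schema.
        f"ALTER USER {ident} WITH DEFAULT_SCHEMA = {ident};",
    ]


if __name__ == "__main__":
    # "svc_srm" is the service account name used in this post.
    for statement in schema_statements("svc_srm"):
        print(statement)
```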
3.) Run the SRM Installer as the Service Account. During the install (during the last step I think) the installer will ask you what account you want to run the service as. Unfortunately the username field is grayed out and shows the account that you used to run the installer. Therefore, either log into the SRM server as the account itself or simply right-click the installer, while holding the Shift key, and choose “Run As Different User”.
4.) Create the System DSN using an account which has permissions on the DB server. Otherwise you won’t be able to select the correct database. To ensure a proper configuration, run the 64-bit ODBC Data Source Administrator as the service account so you’ll have permissions to browse the database server and select the correct database. Note that the 64-bit tool is still named odbcad32.exe (the copy in C:\Windows\System32); there is no odbcad64.exe. Again, either log in as the service account or right-click [while holding the Shift key] and choose “Run As Different User”.
5.) Service account must be an administrator of SRM server. I didn’t find out what happens if this is not completed first, but the documentation indicates it’s needed, so I’d set it.
6.) Java is required for the RecoverPoint SRA installer. Not sure if this applies to every SRA, but EMC’s RecoverPoint does require it. I had to install an older version since, as always, the newest version didn’t work correctly. Check the Java archives for the version that works for you.
7.) Certificates…everybody’s favorite. With regard to the cert, from a security perspective, it’s best to utilize a trusted certificate. Mint one from your internal CA to ensure it’s trusted by your clients and servers joined to the domain. You can use the following config file, along with OpenSSL, to mint a certificate. Just make sure that the certificate has a Subject Alternative Name matching the name selected in the installation wizard and that all other information within the certificate is identical between the two SRM servers.
If you aren’t familiar with using OpenSSL to create your certificates, make sure to check out Derek Seaman’s blog series on the subject. To get you started, here is the config file that I use to create my key & CSR, which I then submitted to my internal CA.
[ req ]
default_bits = 2048
default_keyfile = rui.key
distinguished_name = req_distinguished_name
encrypt_key = no
prompt = no
string_mask = nombstr
req_extensions = v3_req

[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = digitalSignature, keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth, clientAuth
subjectAltName = DNS:server_name

[ req_distinguished_name ]
countryName = US
stateOrProvinceName = State
localityName = City
0.organizationName = org name
organizationalUnitName = IT
# note that commonName MUST match for both SRM servers, so use something generic
commonName = SRM
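Since the two SRM servers need certificates that differ only in the Subject Alternative Name, it can help to generate both config files from one template. The sketch below does that and builds the matching `openssl req` command. The host names are placeholders I made up, the distinguished-name values are the same dummies as in the config above, and the helper names are hypothetical.

```python
from __future__ import annotations

# Same settings as the config file above; only the SAN (and optionally
# the common name) are filled in per server.
CONFIG_TEMPLATE = """\
[ req ]
default_bits = 2048
default_keyfile = rui.key
distinguished_name = req_distinguished_name
encrypt_key = no
prompt = no
string_mask = nombstr
req_extensions = v3_req

[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = digitalSignature, keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth, clientAuth
subjectAltName = DNS:{san_host}

[ req_distinguished_name ]
countryName = US
stateOrProvinceName = State
localityName = City
0.organizationName = org name
organizationalUnitName = IT
commonName = {common_name}
"""


def render_csr_config(san_host: str, common_name: str = "SRM") -> str:
    """Fill in the template for one SRM server."""
    return CONFIG_TEMPLATE.format(san_host=san_host, common_name=common_name)


def csr_command(config_path: str, csr_path: str = "rui.csr") -> list[str]:
    """Build the openssl call that creates a new key and CSR from the config.

    The key file name (rui.key) comes from default_keyfile in the config.
    """
    return ["openssl", "req", "-new", "-config", config_path, "-out", csr_path]


if __name__ == "__main__":
    # Placeholder host names; substitute the names typed into the SRM wizard.
    for host in ("srm-site-a.example.com", "srm-site-b.example.com"):
        print(render_csr_config(host))
        print(" ".join(csr_command(host + ".cfg")))
```

Keeping the common name as a default argument means both servers get the same generic value unless you go out of your way to change it.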
Still on the subject of certificates: the CA’s root certificate must be installed on the vCSA if you’re using it, otherwise you’ll get a “chain not completed” error. Upload the root certificate as a .pem file to /etc/ssl/certs (convert it from .p7b to .pem first if needed). Make sure to include the entire chain if intermediate certificates are in play. After uploading the root certificate to the vCSA, run the command “c_rehash” to rescan the certificates.
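For the .p7b to .pem conversion, OpenSSL’s pkcs7 subcommand does the job. Here is a small sketch that builds that command and counts the certificates in the resulting .pem, as a quick sanity check that the whole chain made it across. The file names and helper names are hypothetical, and the `der` flag is only needed if your CA exported the .p7b in binary (DER) form.

```python
from __future__ import annotations

PEM_BEGIN = "-----BEGIN CERTIFICATE-----"


def p7b_to_pem_command(p7b_path: str, pem_path: str, der: bool = False) -> list[str]:
    """Build the openssl call that dumps every certificate in a .p7b as PEM."""
    cmd = ["openssl", "pkcs7", "-print_certs", "-in", p7b_path, "-out", pem_path]
    if der:
        # Binary .p7b files need the input format spelled out.
        cmd[2:2] = ["-inform", "DER"]
    return cmd


def count_certificates(pem_text: str) -> int:
    """Count the PEM certificate blocks in the converted file's text."""
    return pem_text.count(PEM_BEGIN)


if __name__ == "__main__":
    print(" ".join(p7b_to_pem_command("chain.p7b", "chain.pem")))
```

With one intermediate CA in play, for example, I would expect `count_certificates` to report 2 for the converted file (root plus intermediate); a result of 1 suggests part of the chain is missing.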
8.) Install the SRA after SRM, otherwise it won’t be recognized. This may seem silly, but I was trying to save time and installed it while working on the first server. I found out it didn’t work, even after rescanning. I uninstalled it and installed it again after SRM, and it worked fine.
9.) Modifying the installation requires you to be logged in as the Service Account or any other account that has privileges on the database. Otherwise you’ll get an “A database error has occurred.” message when attempting to modify the installation.
10.) Type in the host name when it asks for “Local Host”. The drop-down list makes it look as though it wants an IP address, as does the wording below it, which states “The address on the local host to be used by SRM”. However, this is the name that SRM will verify the certificate against, so whatever name is entered here should be a Subject Alternative Name on the certificate you provide (see #7 above).
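To double-check that the name you plan to type into “Local Host” really is on the certificate, a check along these lines can help. It is a minimal sketch that works on the dictionary shape Python’s `ssl.SSLSocket.getpeercert()` returns; it does exact, case-insensitive matching only and ignores wildcards. The host names are made up and the helper names are hypothetical.

```python
from __future__ import annotations


def san_dns_names(cert: dict) -> list[str]:
    """Pull the DNS entries out of a getpeercert()-style certificate dict."""
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def hostname_in_san(cert: dict, hostname: str) -> bool:
    """True if hostname appears as a DNS Subject Alternative Name.

    Exact match only (case-insensitive); wildcard entries are not expanded.
    """
    wanted = hostname.strip().lower()
    return any(name.lower() == wanted for name in san_dns_names(cert))


if __name__ == "__main__":
    # Hand-built example in the same shape getpeercert() produces.
    example_cert = {"subjectAltName": (("DNS", "srm-site-a.example.com"),)}
    print(hostname_in_san(example_cert, "srm-site-a.example.com"))
    print(hostname_in_san(example_cert, "10.0.0.5"))
```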
As you can see, there are quite a few “gotchas” during just the installation of SRM. I have to admit that I had more trouble with the install of SRM than I did with setting the rest up and failing over a VM. If you’re installing SRM, be patient, work through these little nuisances, and you’ll have a solid install base to move forward.