ActiveDR Guide | Azure VMware Solution and Pure CBS
Introduction
Disaster recovery planning is a critical piece of any infrastructure in any location. Azure VMware Solution is a VMware environment that runs on Microsoft Azure and can extend its storage capacity through integration with Pure Cloud Block Store. This integration comes with a set of built-in replication features and offloads storage and replication operations to the storage array.
One of the replication features of Pure arrays is ActiveDR. ActiveDR is near-sync replication that provides an array-based failover and failback mechanism to protect your important data in a simple and robust way over great distances with extremely low Recovery Point Objectives (RPO). The simplicity of management makes disaster recovery (and importantly, testing your disaster recovery plans) straightforward and repeatable.
This document describes how to manage ActiveDR in an Azure VMware Solution environment using Pure Storage Cloud Manager for AVS.
Scenarios
There are four scenarios to be covered in this guide:
- Test failover: Promoting and Restoring a Target Pod
- Cleanup Test Failover: Demoting and Removing a Target Pod
- Failover: Promoting a Target Pod after Source Pod Demotion
- Re-Protect: Re-Promoting a Source Pod after Target Pod Demotion
The high-level process for AVS ActiveDR is identical to the process between on-premises VMware environments. The difference is that AVS storage management is simplified and limited to using Run Commands via PureStorage.CBS.AVS.
The scenarios and the test environment for this guide were based on:
- Source: Azure VMware Solution (Version: 7.0.3) and On-Premises VMware environment (Version: 7.0.3)
- Target: Azure VMware Solution (Version: 7.0.3).
Configuration
Before following the scenarios below, make sure you have satisfied the prerequisite configuration:
- Connectivity between both environments:
  - For on-premises to AVS, configure a site-to-site VPN. Alternatively, ExpressRoute can be used and connected to AVS using Global Reach.
  - For two AVS private clouds in the same region, you can configure AVS Interconnect.
  - For two AVS private clouds in different regions, you can configure ExpressRoute Global Reach.
- Deploy and set up Run Commands for use with AVS and CBS. You can follow this guide to set up the integration.
- Initialize AVS and CBS iSCSI connectivity. You can follow this guide to build and initialize the connectivity.
Test Failover: Promoting and Restoring a Target Pod
ActiveDR was specifically designed to make the test of a recovery (or recovery drill) the same operation as an actual recovery.
The overall process of testing is:
- The target pod gets promoted while the source remains promoted and online (no disruption to the source workload). Refer to Promoting an ActiveDR Pod with AVS.
- Restore-PCBSVmfsDatastore gets executed with the name of the target pod or the volume names. Refer to Restoring a Promoted ActiveDR Pod with AVS.
- The VMFS datastores get mounted on the specified AVS cluster. The next step is registering and powering on the VMs.
Note: In a testing workflow, the source pod stays online and in the promoted state. The source pod continues to replicate writes and object changes to the target array, but these changes are not reflected in the target pod while it is promoted.
Cleanup Test Failover: Demoting and Removing a Target Pod
Once testing is complete, cleaning up the recovery is a target-only process; the source is not impacted. Once the cleanup process is over, any changes to objects in the target pod will be discarded (new volumes, writes, snapshots, etc.).
The cleanup process is:
- Shut down VMs running on restored datastores, and unregister them.
- Unmount the datastores by executing Remove-PCBSVmfsDatastore. See Removing a Promoted ActiveDR Pod with AVS.
- Demote the target pod. See Demoting an ActiveDR Pod with AVS.
Failover: Promoting a Target Pod after Source Pod Demotion
If the source pod has been demoted and you promote the target pod, the process is generally called a "recovery" or a "failover". The purpose here is to bring down the workload on the original “source” site and bring it up at a new site. The recovery process is:
On the source environment:
- Shut down VMs running on the source pod's datastores, and unregister them.
- Unmount the datastore.
  - If the source is AVS/CBS, execute the remove command. See Removing a Promoted ActiveDR Pod with AVS.
  - If the source is VMware on-premises/FlashArray, see Demoting an ActiveDR Pod in a VMware Environment.
- Demote the source pod. See Demoting an ActiveDR Pod with AVS.
On the target environment:
- Promote the target pod. See Promoting an ActiveDR Pod with AVS.
- Mount the datastore.
  - If the target is AVS/CBS, execute the restore command. See Restoring a Promoted ActiveDR Pod with AVS.
  - If the target is VMware on-premises/FlashArray, see Promoting an ActiveDR Pod in a VMware Environment.
- Register and power on the virtual machines.
Re-Protect: Re-Promoting a Source Pod after Target Pod Demotion
The last scenario is to re-protect or fail back to the original “source” environment. To achieve that, you need to bring down the target by demoting the target pod, and then bring up the source environment by promoting the source pod.
Note: Once you demote the target and promote the source, the replication direction flips. The source is then ready to be brought up, and VMs can be registered and powered on.
Appendix
Promote an ActiveDR Pod with AVS
In this environment, the target pod is located on Pure CBS. The promotion can be performed from the CBS GUI, from within the Cloud Manager OVA, or from any machine that has PureStoragePowerShellSDK2 installed and can connect and authenticate to CBS.
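From a machine with PureStoragePowerShellSDK2, the promotion could be scripted roughly as in the sketch below. It assumes the SDK's `Update-Pfa2Pod` and `Get-Pfa2Pod` cmdlets and a connection object created beforehand with `Connect-Pfa2Array`. The helper name `Invoke-PodPromotion`, its parameters, and the timeout values are hypothetical and chosen only for illustration.

```powershell
# Minimal sketch, not an official procedure.
# Assumes PureStoragePowerShellSDK2 is installed (PowerShell auto-loads it on
# first use) and that $Array was created earlier with Connect-Pfa2Array.
# Invoke-PodPromotion is a hypothetical helper name.
function Invoke-PodPromotion {
    param(
        [Parameter(Mandatory)] $Array,
        [Parameter(Mandatory)] [string] $PodName,
        [int] $TimeoutSeconds = 300,   # illustrative default
        [int] $PollSeconds = 5         # illustrative default
    )

    # Request promotion; the array applies the change asynchronously.
    Update-Pfa2Pod -Array $Array -Name $PodName -RequestedPromotionState 'promoted' | Out-Null

    # Poll until the pod reports 'promoted' or the timeout expires.
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    do {
        $pod = Get-Pfa2Pod -Array $Array -Name $PodName
        if ($pod.PromotionStatus -eq 'promoted') { return $pod }
        Start-Sleep -Seconds $PollSeconds
    } while ((Get-Date) -lt $deadline)

    throw "Pod '$PodName' did not reach 'promoted' within $TimeoutSeconds seconds."
}
```

A typical call, with placeholder values, would be `$cbs = Connect-Pfa2Array -Endpoint "<CBS management address>" -Credential (Get-Credential) -IgnoreCertificateError` followed by `Invoke-PodPromotion -Array $cbs -PodName "targetPod"`. Waiting for the reported status before continuing is a design choice here, since a restore issued while the pod is still promoting is likely to fail.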
Restore a Promoted ActiveDR Pod with AVS
The Restore run command can be executed from the Cloud Manager OVA or from any machine that has PureStoragePowerShellSDK2 installed and can connect and authenticate to CBS.
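The restore step could be wrapped as in the sketch below, which first confirms that the pod is promoted, because the restore fails to re-signature volumes in a pod that is not promoted. `-PodName` is taken from this guide. The `-ClusterName`, `-AVSCloudName` and `-AVSResourceGroup` parameters are assumed to match the Remove-PCBSVmfsDatastore example in this guide, and the helper name `Restore-PromotedPodDatastore` is hypothetical.

```powershell
# Minimal sketch under stated assumptions, not an official procedure.
# Restore-PromotedPodDatastore is a hypothetical helper name. The cluster and
# AVS parameters are assumed to mirror the Remove-PCBSVmfsDatastore example.
function Restore-PromotedPodDatastore {
    param(
        [Parameter(Mandatory)] $Array,
        [Parameter(Mandatory)] [string] $PodName,
        [Parameter(Mandatory)] [string] $ClusterName,
        [Parameter(Mandatory)] [string] $AVSCloudName,
        [Parameter(Mandatory)] [string] $AVSResourceGroup
    )

    # Stop early if the pod is not promoted, since the restore would fail.
    $pod = Get-Pfa2Pod -Array $Array -Name $PodName
    if ($pod.PromotionStatus -ne 'promoted') {
        throw "Pod '$PodName' is '$($pod.PromotionStatus)'; promote it before restoring."
    }

    # Splat the parameters so the call stays readable.
    $restoreParams = @{
        ClusterName      = $ClusterName
        PodName          = $PodName
        AVSCloudName     = $AVSCloudName
        AVSResourceGroup = $AVSResourceGroup
    }
    Restore-PCBSVmfsDatastore @restoreParams
}
```

To restore a single volume instead of the whole pod, the same call could pass `-VolumeName "podname::volumename"` in place of `-PodName`.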
Remove a Restored ActiveDR Pod with AVS
Before demoting a target pod (part of the test cleanup scenario) or demoting a source pod mounted to an AVS cluster, you have to remove the datastore. Preparing the VMware environment therefore involves shutting down virtual machines running on the ActiveDR-protected storage and informing the ESXi hosts that this storage will go away.
Refer to Demoting an ActiveDR Pod in a VMware Environment for more in-depth information.
The Remove run command can be executed from the Cloud Manager OVA or from any machine that has PureStoragePowerShellSDK2 installed and can connect and authenticate to CBS.
The Remove-PCBSVmfsDatastore command will perform two actions:
- Dismounts and removes an existing CBS VMFS datastore from AVS.
- Removes the volume from the CBS host group and deletes the volume in CBS.
Remove-PCBSVmfsDatastore -ClusterName "mycluster" -DatastoreName "myDatastore" -AVSCloudName "myAVSCloudName" -AVSResourceGroup "myAVSResourceGroup"
Demote an ActiveDR Pod with AVS
Demoting an ActiveDR pod with AVS can be part of two scenarios:
- Cleanup test recovery: demoting a target pod.
- Recovery or failover: demoting a source pod.
Both scenarios require preparing the VMware environment, which involves shutting down and unregistering virtual machines, then removing the datastores. For the removal steps, see the previous section.
Another requirement before demotion is to verify whether there is still active I/O going to the pod. If so, examine the workloads and stop them. If they should not or cannot be stopped, move the workload to a volume that is not in the pod being demoted.
The simplest way to ensure the pod is not actively serving I/O is to check the performance statistics.
Navigate to Analysis -> Performance -> Pods, then select the pod.
The demotion can be performed from the CBS UI, from within the Cloud Manager OVA, or from any machine that has PureStoragePowerShellSDK2 installed and can connect and authenticate to CBS.
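The I/O check and the demotion could be combined in a script along the lines of the sketch below. It assumes the SDK exposes `Get-Pfa2PodPerformance` with `ReadsPerSec` and `WritesPerSec` properties and that `Update-Pfa2Pod` accepts `-RequestedPromotionState 'demoted'`; verify these against your SDK version. The helper names `Test-PodIdle` and `Invoke-PodDemotion` are hypothetical.

```powershell
# Minimal sketch under stated assumptions, not an official procedure.
# Assumes Get-Pfa2PodPerformance returns objects with ReadsPerSec and
# WritesPerSec. Test-PodIdle and Invoke-PodDemotion are hypothetical names.
function Test-PodIdle {
    param(
        [Parameter(Mandatory)] $Array,
        [Parameter(Mandatory)] [string] $PodName
    )

    $samples = @(Get-Pfa2PodPerformance -Array $Array -Name $PodName)
    if ($samples.Count -eq 0) {
        throw "No performance data returned for pod '$PodName'."
    }

    # Sum read and write operations across all returned samples.
    $totalOps = 0
    foreach ($sample in $samples) {
        $totalOps += $sample.ReadsPerSec + $sample.WritesPerSec
    }
    return ($totalOps -eq 0)
}

function Invoke-PodDemotion {
    param(
        [Parameter(Mandatory)] $Array,
        [Parameter(Mandatory)] [string] $PodName
    )

    # Refuse to demote while the pod still reports host I/O.
    if (-not (Test-PodIdle -Array $Array -PodName $PodName)) {
        throw "Pod '$PodName' is still serving I/O; stop or move the workloads first."
    }
    Update-Pfa2Pod -Array $Array -Name $PodName -RequestedPromotionState 'demoted'
}
```

A single performance sample is only a point-in-time reading, so it is worth cross-checking against the Analysis view in the UI before relying on the result.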
Common Issues and Fixes
Restore Command Failed
Symptom:
Executing Restore-PCBSVmfsDatastore returns: 'Failed to re-signature VMFS volume.'
Applies to:
Restoring VMFS datastores located in ActiveDR pods using -PodName "podName" or -VolumeName "podname::volumename".
Cause:
The pod was not promoted.
Resolution:
To restore a pod or a pod::volume, you have to promote the pod first. Once the pod is promoted, it is available for I/O and the Restore command can re-signature its volumes.