Cross-Region Azure VM Disaster Recovery with Azure Site Recovery | Pure CBS on Azure
This guide walks through the process of protecting Azure VMs between two regions using Azure Site Recovery (ASR) Recovery Plans and Pure Cloud Block Store™ (CBS). It focuses on orchestrating failover and failback operations with Automation account runbooks.
Introduction
Planning a business disaster recovery solution can be an expensive and complex process. The requirements for bringing your critical applications back up and running as quickly as possible (RTO) with minimal data loss (RPO) determine how your business achieves continuity, which is what matters most from the customer's perspective.
For organizations that have decided to migrate and run their mission-critical applications in the cloud, Pure Cloud Block Store offers built-in protection against outages of availability zones and regions. It enhances the resiliency of native cloud storage replication by asynchronously or synchronously sending data changes to other regions, while keeping bandwidth use low, reducing egress costs, and eliminating data silos with seamless data mobility.
By integrating Pure Cloud Block Store with Azure Site Recovery, organizations gain an orchestrated DR solution to protect their applications on Azure. The solution in this guide simplifies cross-region protection of Azure VMs by achieving the following objectives:
- Protecting a primary cloud environment with disaster recovery to another region.
- Integrating with Azure Site Recovery to automate and orchestrate failover and failback.
- Lowering RPO/RTO by replicating data volumes efficiently with CBS, using asynchronous periodic replication scheduled as often as every 5 minutes.
- Mitigating ransomware with Pure SafeMode immutable snapshots.
- Bringing enterprise-grade storage features into Azure.
Solution Design
The solution design is based on four main components:
Azure Site Recovery (ASR)
ASR is an Azure service that offers Disaster Recovery as a Service (DRaaS). It manages and orchestrates the replication of VM workloads into Azure from on-premises, or between Azure regions and availability zones.
Pure Cloud Block Store (CBS)
CBS provides seamless data mobility with simple, efficient replication. In this solution, CBS is deployed in two regions to provide storage volumes for the VMs and to protect the data volumes between the regions, using protection group policies that replicate data with the built-in asynchronous replication capabilities.
CBS asynchronous replication uses a non-zero volume replication schedule (5 minutes or more), which delivers a recovery point objective of 10 minutes or longer while minimizing the impact of replication link latency on production I/O and applications.
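The relationship between the replication schedule and the recovery point can be sketched with simple arithmetic. The model below is an illustrative assumption, not a Pure Storage formula: worst-case data loss is taken as one replication interval plus the time a snapshot transfer needs to complete. The function name and the 5-minute transfer time are hypothetical.

```python
def worst_case_rpo_minutes(replicate_interval_min, transfer_min):
    """Estimate worst-case RPO under a simple periodic-replication model.

    Assumption: if the source fails just before a transfer completes, the
    newest usable copy on the target is the previous snapshot, so the loss
    window is one full interval plus the in-flight transfer time.
    """
    if replicate_interval_min <= 0 or transfer_min < 0:
        raise ValueError("interval must be positive and transfer time non-negative")
    return replicate_interval_min + transfer_min


# 5-minute schedule (from this guide) with an assumed 5-minute transfer time.
estimate = worst_case_rpo_minutes(5, 5)
print(f"Estimated worst-case RPO: {estimate} minutes")
```

Under these assumed inputs the estimate is 10 minutes. Real transfer times depend on the change rate and the bandwidth between regions, so treat the result as a rough planning figure only.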
ASR Recovery Plans
Recovery plans are an ASR feature; they define a step-by-step process for VM failover. Each step is either a pre-action or a post-action, and can be either a manual action or a script that runs during failover. A manual action prompts the administrator running the plan to take an action, such as checking a service or changing a setting. A scripted action uses Azure Automation to automate the process, and this is what the solution in this guide uses.
Azure Automation
Azure Automation helps with orchestrating and integrating automation with other Azure or third-party services. This is achieved by placing configuration scripts into an Azure Automation runbook. A runbook can be triggered or tested manually, or triggered automatically by other Azure services, in this case by the customized actions of a recovery plan.
The diagram below shows the simplified high-level solution architecture. The VM workload runs in the main region and is protected/replicated to the secondary region. The DR workflow consists of OS disks protected by ASR and data disks protected by CBS. The backing Automation account has two runbooks. The first runbook is triggered by the ASR recovery plan; the second is triggered by the first and runs PowerShell configuration scripts on a hybrid worker against CBS. A hybrid worker is an alternative environment for running the runbook script, and it is configured on a VM that has connectivity to the CBS management interfaces.
Prerequisites and Basic Setup
Before starting the walkthrough, make sure you have completed or prepared the following steps:
- Deploy Cloud Block Store in two regions. See the Cloud Block Store on Azure implementation guide.
- Deploy a jump/bastion VM to connect to the CBS instances and to the other Azure VMs running the application workload.
- Provision the Azure VMs with volume(s) and apply the iSCSI and MPIO best practices. See Host Management for Cloud Block Store.
- Configure network connectivity between the two regions using VNET peering. See Manage virtual network peering.
- Create a Recovery Services vault in the secondary region to which you want to replicate the VMs. Follow Azure VM disaster recovery to Azure.
Important Considerations and Limitations
This solution works for Windows-based virtual machines only. Pure Storage is currently working on Linux support.
- The automation script in this solution works by matching Azure VM names to the provisioned object names in CBS management. Therefore, for Step 3 of the Prerequisites section, you must keep the names of the Hosts and Volumes created in CBS management the same as the Azure VM names.
- For this solution to work seamlessly in an automated fashion for both failover and failback, you must perform the first test failover to the secondary region manually and configure the protected VMs with the second CBS iSCSI target. This configuration remains part of the VM OS configuration, so during a failover drill the VM can auto-connect to the iSCSI target. Follow Host Management for Cloud Block Store and repeat Step 3 of the Prerequisites section.
- Check the operating system support matrix before adding an Azure VM. Not all Linux kernel versions are supported by the ASR mobility agent. See the Azure to Azure support matrix.
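Because the automation depends on the name-matching rule above, it can help to check an inventory before the first failover. The following is a minimal sketch of such a check, assuming exact, case-sensitive matching. The function name and the sample inventory are hypothetical, and the actual runbook may compare names differently.

```python
def find_unmatched_vms(vm_names, cbs_host_names, cbs_volume_names):
    """Return {vm_name: [missing object kinds]} for VMs without a same-named
    CBS host or volume.

    Assumption: names must match exactly (case-sensitive).
    """
    hosts = set(cbs_host_names)
    volumes = set(cbs_volume_names)
    missing = {}
    for vm in vm_names:
        gaps = []
        if vm not in hosts:
            gaps.append("host")
        if vm not in volumes:
            gaps.append("volume")
        if gaps:
            missing[vm] = gaps
    return missing


# Hypothetical inventory: VM02 has a host, but its volume is named differently.
report = find_unmatched_vms(
    vm_names=["VM01", "VM02"],
    cbs_host_names=["VM01", "VM02"],
    cbs_volume_names=["VM01", "VM02-data"],
)
print(report)
```

An empty result means every VM has a host and a volume with the same name. Run the same check against the inventory of each CBS array, since the objects must exist in both regions.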
Solution Walkthrough
After all the prerequisites are in place, follow this walkthrough to set up the solution components.
Setup Automation Account: Runbooks and Hybrid Worker
Create Automation Account
- In Azure Console, navigate to Automation Accounts and click +Create.
- For Region, select a region other than the main region, as the account cannot reside within the protected region.
- Under Managed Identities tab, select System assigned.
- From the Network tab, select Public.
- Finally, select Review + Create.
Create Runbook
- On the left panel of the created Automation account, click Runbooks. Then create two runbooks with the following requirements:
- Name: Runbook 1 for Recovery Plan Post Actions. Type: PowerShell. Version: 5.1.
- Name: Runbook 2 for CBS Actions. Type: PowerShell. Version: 5.1.
- After the runbooks are created, open each runbook and click Edit. Copy and paste the ASR-plan-actions-runbook and CBS-actions-runbook scripts into their respective runbooks, then click Publish.
- The CBS-actions-runbook will be revisited in a later step to edit its variable section after CBS replication is configured.
Create Hybrid worker
For the hybrid worker requirements, see Deploy an extension-based Hybrid Runbook Worker.
- On the left panel of the created Automation account, click Hybrid worker groups.
- Select + Create hybrid worker group.
- From the Basics tab, in the Name text box, enter a name for your Hybrid worker group.
- For Use run as credential, you can select No, and the hybrid extension will be installed using the local system account. Alternatively, select Yes and then provide specific system credentials.
- Next, select Add machines to go to the Add machines as hybrid worker page. Select the VM using its checkbox, click Add, and then advance to Review + Create.
Note: The selected hybrid worker VM must have network access to the VNET where CBS resides. As a best practice, create a separate VNET (the Connectivity and Tooling VNET shown in the diagram in the section above), then use VNET peering to connect it to both CBS VNETs.
Setup ASR Replication
Enable VM Replication
- In Azure Console, navigate to the created Recovery Services vault, then click +Replicate.
- Follow the steps under Enable Azure to Azure replication to add the designated VMs to be protected between the regions.
- Protecting the OS disks of the VMs might take a few minutes. Once the VMs are in a protected state, move to the next step.
Create Recovery Plan
- In the Recovery Services vault, select Recovery Plans (Site Recovery), then click +Recovery Plan.
- In Create recovery plan, specify a name for the plan. Choose the source region (the location of the VMs for which replication was enabled in the previous steps) and the target region where the Recovery Services vault resides.
- Select Resource Manager for the deployment model.
- In the next tab, under Select items, choose the virtual machines that you want to add to the plan, then click OK. By default, the machines are added to group1.
Add a Script Action to Recovery Plan Group
- In Recovery Plans, click the created plan, then Customize.
- Click on the ellipsis at the left of the group, select Add post action.
- In Insert action, there are two options: Manual action or Script. Select Script.
- Specify the Azure Automation Account created previously, and select the appropriate Azure Runbook Script.
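When a recovery plan invokes a runbook, ASR passes it a RecoveryPlanContext object describing the failover (plan name, failover type, direction, group, and a map of the VMs in the group). The runbooks in this guide are PowerShell. The Python sketch below only illustrates the shape of that context and how a script could read the failover direction and VM names from it. All sample values (plan name, IDs, resource group) are made up, and `summarize_context` is a hypothetical helper, not part of the guide's scripts.

```python
import json

# Hypothetical sample of the context passed to a runbook; every value is invented.
SAMPLE_CONTEXT = """
{
  "RecoveryPlanName": "AzureVMs-RP",
  "FailoverType": "Test",
  "FailoverDirection": "PrimaryToSecondary",
  "GroupId": "1",
  "VmMap": {
    "11111111-1111-1111-1111-111111111111": {
      "SubscriptionId": "00000000-0000-0000-0000-000000000000",
      "ResourceGroupName": "dr-rg",
      "CloudServiceName": "dr-rg",
      "RoleName": "VM01",
      "RecoveryPointId": "TimeStamp"
    }
  }
}
"""


def summarize_context(raw_json):
    """Extract the fields a post-action script would typically branch on."""
    context = json.loads(raw_json)
    # RoleName holds the VM name; VmMap is keyed by an ASR-internal VM ID.
    vm_names = sorted(vm["RoleName"] for vm in context["VmMap"].values())
    return {
        "plan": context["RecoveryPlanName"],
        "failover_type": context["FailoverType"],
        "direction": context["FailoverDirection"],
        "vm_names": vm_names,
    }


summary = summarize_context(SAMPLE_CONTEXT)
print(summary)
```

A script that knows the direction (PrimaryToSecondary or SecondaryToPrimary) and the VM names has enough information to decide which CBS array to act on and which same-named hosts and volumes to look up.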
Setup CBS Replication
The following steps are performed by running CLI commands against the CBS arrays from the jump/bastion VM. You can follow this guide for a similar setup using the GUI.
Configure Asynchronous Replication
- On the CBS-DR array, get the connection key.
purearray list --connection-key
Connection Key: xxxxxxxx-yyyy-zzzz-aaaa-bbbbbbbbbbbb
- Return to the CBS-Main array, and create the connection.
purearray connect --management-address xx.xx.xx.xx --type async-replication --replication-transport ip --connection-key
Enter the connection key of the target array: xxxxxxxx-yyyy-zzzz-aaaa-bbbbbbbbbbbb
Create Protection Group
Before starting the steps in this section, make sure you have followed step 2 of the Important Considerations section. This means each Azure VM has a Host and a Volume object provisioned on each CBS, in both regions. The main goal is to protect the data in both directions, so the following steps are carried out on both CBS arrays.
- On the CBS-Main array, create a protection group, add the hosts, and add the second CBS as the target. Once executed, this creates a Source Protection Group on this array and a Target Protection Group on the target CBS array.
purepgroup create --hostlist VM01,VM02 --targetlist CBS-DR AzureVMs-PG
Instead of a host list, you can add a host group (if you have already grouped your hosts) or a list of volumes.
purepgroup create [--hgrouplist HGROUPS | --hostlist HOSTS | --vollist VOLS] --targetlist CBS-Array-Name PGName
- Access CBS-DR and repeat step 1: create the protection group and add hosts or volumes to it.
purepgroup create --hostlist VM01,VM02 --targetlist CBS-Main AzureVMs-PG
Configure Replication Policy
The following steps attach a replication policy to the created protection group and enable replication.
- On the CBS-Main array, schedule and enable replication by executing the commands below.
purepgroup schedule --replicate-frequency 5m PGName    # Replicates every 5 minutes.
purepgroup retain --all-for 5d PGName    # Retains all snapshots on the target for 5 days.
purepgroup retain --per-day 8 --days 5 PGName    # Retains 8 snapshots per day for an additional 5 days after the all-for duration.
purepgroup enable --replicate PGName    # Enables replication.
- Repeat the same previous steps on the CBS in the second region.
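Since the same commands are run on both arrays with only the target swapped, the sequence can be generated rather than retyped. The sketch below is a hypothetical helper that only builds the command strings used in this section. It does not connect to an array or run anything, and the function name and parameters are illustrative.

```python
def replication_commands(pgroup, hosts, target_array,
                         frequency="5m", all_for="5d", per_day=8, days=5):
    """Build the CLI lines from this section for one array.

    Only the flags shown in this guide are used; review the output before
    pasting it into a CBS CLI session.
    """
    host_list = ",".join(hosts)
    return [
        f"purepgroup create --hostlist {host_list} --targetlist {target_array} {pgroup}",
        f"purepgroup schedule --replicate-frequency {frequency} {pgroup}",
        f"purepgroup retain --all-for {all_for} {pgroup}",
        f"purepgroup retain --per-day {per_day} --days {days} {pgroup}",
        f"purepgroup enable --replicate {pgroup}",
    ]


# Each array replicates to the other, so the target is swapped per array.
for array, target in (("CBS-Main", "CBS-DR"), ("CBS-DR", "CBS-Main")):
    print(f"# Run on {array}")
    for command in replication_commands("AzureVMs-PG", ["VM01", "VM02"], target):
        print(command)
```

Generating both sets from one definition keeps the schedule and retention values identical in the two regions, which matters because either array can become the replication source after a failover.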
You can configure the same replication schedule using the management GUI.
Add the Required Variables to Runbook Script
- Get the management IP address of each CBS array, the one you have been using to log in to the management GUI or CLI.
- On both CBS arrays, use the GUI to access Protection > Protection Group. Then, under Target Protection Group, copy the name of the target protection group. The same can be achieved using purepgroup listobj.
- In Azure Console, navigate to Automation account > Runbooks > CBS-actions-runbook, then click Edit. Modify the variable block as shown below with the IP addresses and Target Protection Group of each CBS.
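The runbook's actual variable block is not reproduced here. As an aid for collecting the values, the sketch below assumes the common Purity convention that a replicated protection group is listed on the target array as source-array:group-name. That convention, the dictionary layout, the key names, and the example IP addresses are all assumptions for illustration. Always confirm the names against what the GUI or CLI reports on your arrays.

```python
def target_pgroup_name(source_array, pgroup):
    """Name under which a replicated group is assumed to appear on the target."""
    return f"{source_array}:{pgroup}"


def collect_runbook_values(main, dr, pgroup):
    """Gather, per array, the management IP and the Target Protection Group
    that array receives from its peer.

    `main` and `dr` are dicts with hypothetical keys 'name' and 'mgmt_ip'.
    """
    return {
        main["name"]: {
            "mgmt_ip": main["mgmt_ip"],
            "target_pgroup": target_pgroup_name(dr["name"], pgroup),
        },
        dr["name"]: {
            "mgmt_ip": dr["mgmt_ip"],
            "target_pgroup": target_pgroup_name(main["name"], pgroup),
        },
    }


# Example values only; the addresses come from documentation-reserved ranges.
values = collect_runbook_values(
    main={"name": "CBS-Main", "mgmt_ip": "192.0.2.10"},
    dr={"name": "CBS-DR", "mgmt_ip": "198.51.100.10"},
    pgroup="AzureVMs-PG",
)
for array, settings in values.items():
    print(array, settings)
```

Under this assumption, the Target Protection Group visible on CBS-DR would be CBS-Main:AzureVMs-PG, and the one on CBS-Main would be CBS-DR:AzureVMs-PG.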
Run Failover and Re-protect
To run a failover to the DR/secondary region, follow these steps:
- In Azure Console, navigate to the Recovery Services vault > Recovery Plans (Site Recovery), then select the created recovery plan.
- Click Failover, then choose a recovery point for the VM OS disk and select the option to shut down machines.
- Monitor the failover progress through the job notification, then access the VM(s) in the second region.
- Once the failover has finished and the machines are up and running, perform Re-protect on the recovery plan. This initiates OS disk replication back to the main region. Note: No actions are required at the CBS array level.
Run Failback
Failing back to the main region is a straightforward process, similar to the failover steps. Once all machines are re-protected, fail over to the main region by navigating to Recovery Plans, selecting the plan, and clicking Failover.