ActiveDR and Microsoft Solutions
ActiveDR
Overview
ActiveDR, available in Purity//FA 6.0, delivers near-zero RPO with continuous replication in an Active-Passive configuration and provides fast recovery and failover for global disaster recovery. Purity ActiveDR seamlessly protects application data across almost any distance while minimizing both recovery points and recovery times. With single-command failover, intelligent failback, and non-disruptive disaster recovery testing, ActiveDR accelerates your responsiveness to outages. Simplifying disaster recovery and testing without stopping replication means you can test more often and have more confidence in your business-resilience capabilities. ActiveDR has no impact on application latency, and DR processes for SQL Server, such as configuration, demotion, and promotion, can be automated with the PowerShell SDK. For a deep dive on ActiveDR, see the whitepaper: Protect Application Data with ActiveDR in Purity FA//6.
The purpose of this guide is to provide examples of protecting the two most common Microsoft Solutions: SQL Server, and Hyper-V.
For VMware-specific guidance, start with the article: Demoting an ActiveDR Pod in a VMware Environment. At its core, ActiveDR asynchronously replicates volumes that are placed in a Pod, which is a container on the FlashArray, where all contained volumes are replicated as a consistent group. For an example of synchronous replication with automatic failover, see these three ActiveCluster whitepapers that focus on: VMware, Hyper-V and SQL Server.
A Pod configured for ActiveDR will have a source, or promoted Pod, and a destination, or demoted Pod.
The image above shows a healthy Pod where Local Pod refers to the Pod on the logged on FlashArray, and Remote Pod is on the Remote Array.
The order of operation for a failover should be:
- Stop applications, dismount databases, or offline cluster roles, if accessible.
- Offline all disks and cluster disk resources in the source Pod, if accessible.
- Demote the source Pod, if accessible.
- Promote the DR Pod.
- Online all disks and cluster disk resources in the DR Pod.
- Start applications, mount databases, or online cluster roles.
In the case of a real outage, availability can be restored at the DR site very quickly with a small, uncomplicated script. Leverage the Pure Storage PowerShell SDK to promote the DR Pod, online disks, and mount databases or start virtual machines. See example code blocks in the Appendix.
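The order of operations above can be sketched as a small runner that executes steps in sequence and tolerates failures only for the "if accessible" steps on the source side. This is a minimal sketch, not part of the Pure Storage SDK: the function name Invoke-FailoverSequence and the Name/Action/Optional step layout are hypothetical, and the cmdlets in the commented usage are the ones shown in the Appendix.

```powershell
# Hypothetical helper: runs failover steps in the documented order.
# Each step is a hashtable: @{ Name = '...'; Action = { ... }; Optional = $true }
# Optional marks the "if accessible" steps that may fail when the source site is down.
function Invoke-FailoverSequence {
    param(
        [Parameter(Mandatory)] [hashtable[]] $Steps
    )
    foreach ($step in $Steps) {
        try {
            # Actions should use -ErrorAction Stop so that failures reach the catch block.
            & $step.Action | Out-Null
            [pscustomobject]@{ Step = $step.Name; Status = 'Done' }
        }
        catch {
            if ($step.Optional) {
                # Source-side step could not run; continue with the DR-side steps.
                [pscustomobject]@{ Step = $step.Name; Status = 'Skipped' }
            }
            else {
                throw "Required step '$($step.Name)' failed: $_"
            }
        }
    }
}

# Usage sketch (resource and Pod names are the examples used in the Appendix):
# Invoke-FailoverSequence -Steps @(
#     @{ Name = 'Offline VM';   Optional = $true; Action = { Stop-ClusterResource -Name 'Virtual Machine Exch02' -ErrorAction Stop } },
#     @{ Name = 'Offline CSV';  Optional = $true; Action = { Stop-ClusterResource -Name 'Cluster Disk 5' -ErrorAction Stop } },
#     @{ Name = 'Demote prod';  Optional = $true; Action = { Update-Pfa2Pod -Array $pfa57 -Name qpodprod -RequestedPromotionState 'demoted' -Quiesce $true } },
#     @{ Name = 'Promote DR';   Action = { Update-Pfa2Pod -Array $pfa130 -Name qpoddr -RequestedPromotionState 'promoted' } }
# )
```

Keeping the source-side steps optional means the same step list can serve both a planned failover and a real outage where the source array or hosts are unreachable.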
Note: A Pod that is not connected or stretched to a DR Pod can have existing volumes moved into it. Once a Pod is stretched, new volumes can be created in the Pod, but an existing volume cannot be moved into it.
Failover Setup and Testing
It is recommended to prepare for failover by configuring the servers in the DR site before failing over. Since the DR Pod will be in a demoted state, the Pure Volumes will be read-only. The volumes can still be connected to the DR hosts, which allows devices and paths to be discovered and configured ahead of time, reducing the Recovery Time Objective in the event of a failover.
Failover can be tested by simply promoting the DR Pod. This does not interrupt replication, and when testing is completed the DR Pod can be demoted, discarding the changes made during the test. This means that the same script used to promote the DR Pod and bring VMs and databases online in a real emergency can be used to test the Pod promotion. See the Test Failover section of the whitepaper: Protect Application Data with ActiveDR in Purity FA//6.
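A test failover of this kind could be wrapped as follows. This is a sketch under assumptions: Invoke-DrPodTest and its parameters are hypothetical names, while Update-Pfa2Pod and Get-Pfa2Pod are the SDK 2.x cmdlets used in the Appendix. The validation script block is expected to bring disks and databases online, run its checks, and take them offline again before it returns.

```powershell
# Hypothetical wrapper for a non-disruptive DR test with the SDK 2.x cmdlets.
function Invoke-DrPodTest {
    param(
        [Parameter(Mandatory)] $Array,                    # connection from Connect-Pfa2Array
        [Parameter(Mandatory)] [string] $PodName,         # DR Pod, for example qpoddr
        [Parameter(Mandatory)] [scriptblock] $Validation, # online, check, then offline again
        [int] $PollMilliseconds = 500
    )
    # Promote the DR Pod; replication from the source Pod keeps running.
    Update-Pfa2Pod -Array $Array -Name $PodName -RequestedPromotionState 'promoted' | Out-Null

    # Wait until the Pod has left the "promoting" state before touching any disks.
    while ((Get-Pfa2Pod -Array $Array -Name $PodName).PromotionStatus -match 'promoting') {
        Start-Sleep -Milliseconds $PollMilliseconds
    }

    try {
        & $Validation
    }
    finally {
        # Demoting the DR Pod discards whatever was written during the test.
        Update-Pfa2Pod -Array $Array -Name $PodName -RequestedPromotionState 'demoted' | Out-Null
    }
}

# Usage sketch:
# Invoke-DrPodTest -Array $pfa130 -PodName qpoddr -Validation {
#     Set-Disk -Number 6 -IsOffline $false
#     # ... run checks ...
#     Set-Disk -Number 6 -IsOffline $true
# }
```

The finally block returns the Pod to the demoted state even when a check throws, so a failed test does not leave the DR Pod promoted.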
To reverse the replication link, demote the Production Pod and promote the DR Pod. With the DR Pod promoted, demoting the Production Pod with the skip quiesce option automatically starts replication from the DR Pod to the Production Pod.
Note: The act of demoting any Pod creates an .undo-demote snapshot of the Pod that is placed in the destroyed bucket. This snapshot is used to undo an accidental demotion, and it is automatically eradicated after 24 hours. If testing failover more often than once every 24 hours, this destroyed snapshot will need to be manually eradicated before the Pod can be demoted again. The .undo-demote Pod can be cloned so that any data written but not replicated can be recovered.
Hyper-V
While Hyper-V can be run as a stand-alone Windows Server, this is not a common practice for production deployments. In production, to provide High Availability, server nodes are usually joined together into a Microsoft Cluster using Failover Cluster Manager or PowerShell. A Microsoft Cluster can have two types of disk resources: Cluster Disk Resources and Cluster Shared Volumes (CSV).
The image above shows both a Cluster Disk Resource assigned to a File Server Role, and three Cluster Disk Resources configured as Cluster Shared Volumes (CSV) in Failover Cluster Manager.
A Cluster Disk Resource is a disk that has been added to the cluster, as a disk resource. A Cluster Disk Resource, like any standard cluster resource, can only be online on one cluster node at a time. Some examples of a Cluster Disk Resource use case are: a disk quorum, a File Server Role, and an Always On SQL Failover Cluster Instance Role.
A CSV is a clustered disk that is promoted or assigned into a Cluster Shared Volume. A CSV is assigned a mount point under "C:\ClusterStorage" and is accessible to every node of the cluster. That enables virtual machines to be live migrated independently of each other since many virtual machines can be placed on a single CSV and each virtual machine can be brought online on any of the physical cluster nodes.
Note: It is important that Cluster Disks and Cluster Roles, which includes highly available virtual machines, are taken offline before demoting the Pod.
When configuring ActiveDR with Hyper-V using Virtual Machines on CSVs, the DR Cluster or standalone Hyper-V Server should be preconfigured and tested before an outage occurs, to reduce the Recovery Time Objective. The DR Cluster can be online, or kept offline in order to save power. If the DR Cluster is online, the Virtual Machines and their CSVs should be offline when not testing failover.
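The offline ordering described in this section (roles first, then their disks) could be scripted roughly as below. Stop-ClusterResource and Get-ClusterResource come from the Failover Cluster module; the function name Stop-PodClusterResource and its parameters are hypothetical. The sketch returns any resource that is still not offline so that a caller can stop before demoting the Pod.

```powershell
# Hypothetical helper: take cluster roles offline before the disks they depend on.
function Stop-PodClusterResource {
    param(
        [string[]] $VmResourceName = @(),   # for example 'Virtual Machine Exch02'
        [string[]] $DiskResourceName = @()  # for example 'Cluster Disk 5'
    )
    # VMs first, so guests shut down while their storage is still writable.
    $ordered = @($VmResourceName) + @($DiskResourceName)
    foreach ($name in $ordered) {
        Stop-ClusterResource -Name $name -ErrorAction Stop | Out-Null
    }
    # Emit every resource whose state is not Offline; no output means it is safe to continue.
    foreach ($name in $ordered) {
        $resource = Get-ClusterResource -Name $name
        if ($resource.State -ne 'Offline') { $resource }
    }
}

# Usage sketch:
# $stillOnline = @(Stop-PodClusterResource -VmResourceName 'Virtual Machine Exch02' -DiskResourceName 'Cluster Disk 5')
# if ($stillOnline.Count -gt 0) { throw 'Resources still online; do not demote the Pod.' }
```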
SQL Server
SQL Server can be configured as an Always On Failover Cluster Instance, an Always On Availability Group, or Standalone. While ActiveDR can be implemented with many SQL Server configurations, the two most common use cases are replicating the entire VM and replicating just the user data. Stretching a SQL Always On Availability Group and using ActiveDR is not supported since you would be replicating the data at both the storage and application layers simultaneously. Replicating a VM is more complicated, since the networking configuration is stored inside the VM. If the network is not stretched, user orchestration is required to change the networking subnet or VLAN. This can be as simple as a script, or more complex if the network must be changed before the application starts. In most cases, an Always On Availability Group will be better served by using replication at the SQL layer, rather than at the storage layer. Adding a node to the cluster at the DR site, then an asynchronous copy of the database on that node, is the simplest configuration.
User data here means non-system database files and transaction log files. The approach is similar to Cloning SQL Server Databases using FlashArray: once you have a snapshot of the Pure Volumes that contain databases and transaction logs, you can copy it to new volumes or overwrite existing volumes. ActiveDR behaves in a similar way. When you promote a demoted Pod, any in-progress replication finishes first; if the original promoted side is unavailable, the most recent point-in-time, crash-consistent version of the volumes becomes writable at the DR site.
Standalone & Always On Failover Cluster Instance
Make sure that all of the non-system databases are placed on volumes that are in the same Pod. Volumes which contain system databases and TempDB should not be placed in the Pod.
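One way to check this placement is to list the files of every non-system database and compare their paths with the drive letters or mount points backed by Pod volumes. The sketch below assumes the SQL Server PowerShell module (Invoke-Sqlcmd) is available; the function names and the -PodPath parameter are hypothetical, and the path list has to be supplied by the administrator since SQL Server has no knowledge of Pods.

```powershell
# Hypothetical helper: list data and log files of all non-system databases.
function Get-UserDatabaseFile {
    param([Parameter(Mandatory)] [string] $ServerInstance)
    $query = @'
SELECT DB_NAME(database_id) AS DatabaseName,
       type_desc            AS FileType,
       physical_name        AS PhysicalName
FROM sys.master_files
WHERE database_id > 4   -- ids 1 to 4 are master, tempdb, model, and msdb
ORDER BY DatabaseName, FileType;
'@
    Invoke-Sqlcmd -ServerInstance $ServerInstance -Query $query
}

# Hypothetical helper: return the files that do not live under any Pod-backed path.
function Find-FileOutsidePod {
    param(
        [Parameter(Mandatory)] [object[]] $Files,
        [Parameter(Mandatory)] [string[]] $PodPath   # for example 'E:\', 'F:\'
    )
    $Files | Where-Object {
        $file = $_.PhysicalName
        -not ($PodPath | Where-Object {
            $file.StartsWith($_, [System.StringComparison]::OrdinalIgnoreCase)
        })
    }
}

# Usage sketch:
# $files = Get-UserDatabaseFile -ServerInstance 'ch2-barkz-03\qstance'
# Find-FileOutsidePod -Files $files -PodPath 'E:\', 'F:\'
```

Any file returned by the second function belongs to a user database but sits outside the replicated volumes, so that database would not be recoverable at the DR site.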
Always On Availability Group
An Availability Group is a special case since it can have many synchronous and asynchronous copies of the databases in the Availability Group on nodes throughout the cluster. It is more common to stretch the Availability Group to a SQL Server in the DR site, using an asynchronous copy of the Availability Group. Should your use case require the storage to handle the replication, it is more efficient to place only one copy of the Availability Group databases in a Pod to be replicated. In that case, the SQL instance in the DR site cannot be part of the production SQL Availability Group cluster.
Appendix
CimSession
Many of the following examples will work if run directly on the Hyper-V or SQL Server. When more than one server is involved, or if the automation is run from a remote management server, some of the PowerShell cmdlets will need to be modified to connect to the correct endpoint. One way to do that is to create a CIM session and pass it to the cmdlets that perform operations such as changing a disk or Cluster Resource state. See New-CimSession for details on creating a CIM session. Many of the PowerShell cmdlets referenced below, such as Set-Disk, have a -CimSession parameter that directs the operation to a remote session.
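As an illustration of the CIM session pattern, the sketch below changes the state of a disk on a remote server. New-CimSession, Set-Disk, Get-Disk, and Remove-CimSession are standard Windows cmdlets; the function name Set-RemoteDiskState and the server name in the usage comment are hypothetical.

```powershell
# Hypothetical helper: online or offline a disk on a remote server through a CIM session.
function Set-RemoteDiskState {
    param(
        [Parameter(Mandatory)] [string] $ComputerName,
        [Parameter(Mandatory)] [int] $DiskNumber,
        [Parameter(Mandatory)] [bool] $Offline
    )
    $cimsess = New-CimSession -ComputerName $ComputerName -ErrorAction Stop
    try {
        Set-Disk -Number $DiskNumber -IsOffline $Offline -CimSession $cimsess
        if (-not $Offline) {
            # A disk from a newly promoted Pod may still carry the read-only flag.
            Set-Disk -Number $DiskNumber -IsReadOnly $false -CimSession $cimsess
        }
        # Return the resulting state so the caller can verify it.
        Get-Disk -Number $DiskNumber -CimSession $cimsess |
            Select-Object Number, IsOffline, IsReadOnly
    }
    finally {
        Remove-CimSession -CimSession $cimsess
    }
}

# Usage sketch (server name is a placeholder):
# Set-RemoteDiskState -ComputerName 'dr-sql-01' -DiskNumber 6 -Offline $false
```

The script examples later in the Appendix pass a variable named $cimsess to Set-Disk; a session created with New-CimSession as shown here is one way to populate it.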
CSV Online & Offline
In the above image, a CSV is about to be brought offline in Failover Cluster Manager. An offline Cluster Disk Resource can be brought online with "Bring Online" after right-clicking the disk in Failover Cluster Manager.
Stop-ClusterResource -Name "Cluster Disk 5"
Start-ClusterResource -Name "Cluster Disk 5"
In the above code block, the same CSV is about to be brought offline or online in PowerShell.
VM Online & Offline
In the above image, the VM Exch02 was just shut down by right-clicking it and selecting Shut Down in Failover Cluster Manager. It can be started by right-clicking it and selecting Start in Failover Cluster Manager.
Stop-ClusterResource -Name "Virtual Machine Exch02"
Start-ClusterResource -Name "Virtual Machine Exch02"
In the above code block, the VM Exch02 can be stopped or started in PowerShell. Ensure that the Failover Cluster Module for Windows PowerShell is installed.
Physical Disk Online & Offline
A physical disk that is not a cluster disk resource can be brought online or offline in Disk Management, or PowerShell. A Cluster Disk Resource or CSV does not require this step as the act of taking the cluster resource online or offline will also change the state of the disk.
In the above image, when the disk is right-clicked, it can be brought online, or offline in Disk Management.
Set-Disk -Number 6 -IsOffline $true
Set-Disk -Number 6 -IsOffline $false
In the above code block, disk number 6 is brought online or offline with PowerShell.
SQL Database Online & Offline
In the above image, using SQL Server Management Studio, the database 'dbprod' is about to be taken offline. An offline database can be brought online by right-clicking the database, selecting Tasks, and then selecting Bring Online.
Invoke-Sqlcmd -InputFile c:\ps\repmount.sql -QueryTimeout 3600 -ServerInstance 'ch2-barkz-03\qstance'
Invoke-Sqlcmd -InputFile c:\ps\repdismount.sql -QueryTimeout 3600 -ServerInstance 'ch2-barkz-03\qstance'
In the above code block, a database is brought online or taken offline using the SQL Server PowerShell Module.
Pod Demotion
Before demoting a Pod, it is important that all Virtual Machines, or other Cluster Roles, and SQL Databases, or other applications, are shut down and offline. Virtual Machines will not gracefully shut down and SQL Databases will be marked suspect if the underlying storage is ungracefully forced into read-only mode.
In the above image, a disk that is part of a demoted Pod is now Read Only. It is recommended to offline all disks contained in a Pod before demoting the Pod.
In the above image, the dbprod-dr database is marked (Suspect) because it was not offline before the Pod was demoted.
In the above image, Pod qpodprod is about to be demoted. Be sure that all disks and applications are offline.
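A pre-demotion check along these lines could confirm that disks and databases are offline before the demote command is issued. This is a sketch, not an SDK feature: Get-DemotionBlocker and its parameters are hypothetical, Get-Disk is the standard Windows storage cmdlet, and the database state is read from sys.databases through Invoke-Sqlcmd.

```powershell
# Hypothetical helper: report anything that should be offline before a Pod is demoted.
function Get-DemotionBlocker {
    param(
        [int[]] $DiskNumber = @(),
        [string] $ServerInstance,
        [string[]] $DatabaseName = @(),
        $CimSession                      # optional, for disks on a remote server
    )
    $blockers = @()
    $diskArgs = @{}
    if ($CimSession) { $diskArgs.CimSession = $CimSession }

    foreach ($number in $DiskNumber) {
        $disk = Get-Disk -Number $number @diskArgs
        if (-not $disk.IsOffline) { $blockers += "Disk $number is online" }
    }

    if ($ServerInstance -and $DatabaseName.Count -gt 0) {
        $states = Invoke-Sqlcmd -ServerInstance $ServerInstance -Query 'SELECT name, state_desc FROM sys.databases;'
        foreach ($db in $states | Where-Object { $DatabaseName -contains $_.name }) {
            if ($db.state_desc -ne 'OFFLINE') {
                $blockers += "Database $($db.name) is $($db.state_desc)"
            }
        }
    }
    $blockers
}

# Usage sketch:
# $blockers = @(Get-DemotionBlocker -DiskNumber 6 -ServerInstance 'ch2-barkz-03\qstance' -DatabaseName 'dbprod')
# if ($blockers.Count -gt 0) { $blockers; throw 'Do not demote the Pod yet.' }
```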
Pure Storage PowerShell SDK 1.17 (PureStoragePowerShellSDK)
New-PfaCLICommand -EndPoint 10.21.201.57 -UserName pureuser -Password $57 -CommandText "purepod demote qpodprod --quiesce"
In the above code block, using v1.17 of the Pure Storage PowerShell SDK, the command to demote the Pod qpodprod is issued.
Pure Storage PowerShell SDK 2.2 (purestoragepowershellsdk2)
Update-Pfa2Pod -Array $pfa57 -Name qpodprod -RequestedPromotionState "demoted" -Quiesce $true
In the above code block, using v2.2 of the Pure Storage PowerShell SDK, the command to demote the Pod qpodprod is issued.
Pod Promotion
In the above image, Pod qpoddr is about to be promoted.
Pure Storage PowerShell SDK 1.17 (PureStoragePowerShellSDK)
New-PfaCLICommand -EndPoint 10.21.201.130 -UserName pureuser -Password $130 -CommandText "purepod promote qpoddr"
In the above code block, using v1.17 of the Pure Storage PowerShell SDK, the command to promote the Pod qpoddr is issued.
Pure Storage PowerShell SDK 2.2 (purestoragepowershellsdk2)
Update-Pfa2Pod -Array $pfa130 -Name qpoddr -RequestedPromotionState "promoted"
In the above code block, using v2.2 of the Pure Storage PowerShell SDK, the command to promote the Pod qpoddr is issued.
Script Examples
For automation, see the Pure Storage PowerShell SDK. The version numbers of the 1.x SDK (1.17) and the recently shipped 2.x SDK (2.2) correspond to the version of the REST API that each supports. The v2 SDK has cmdlets to initiate a Pod promotion or demotion, while the v1 SDK requires the New-PfaCLICommand cmdlet to pass a command to the CLI.
Demotion Script Example SDK 1.17
# The disk resources of any applications or cluster roles on the Pod should be brought offline after the application is stopped.
# Before demoting the Pod, offline VMs, SQL DBs, and disks
# Offline DB dbprod-dr in the DR site
Invoke-Sqlcmd -InputFile c:\ps\repdismount.sql -QueryTimeout 3600 -ServerInstance 'ch2-barkz-03\qstance'
# Offline physical disk housing the SQL DB dbprod-dr in the DR site
Set-Disk -Number 6 -CimSession $cimsess -IsOffline $true
# Offline VM Exch02-dr in the DR site, requires Failover Cluster module
Stop-ClusterResource -Name "Virtual Machine Exch02"
# Offline CSV disk housing VM Exch02-dr in the DR site
Stop-ClusterResource -Name "Cluster Disk 5"
# Connect to Prod FlashArray #57. For examples on how to encrypt the password to a text file, see ConvertTo-SecureString
$57 = Get-Content "C:\ps\57.txt" | ConvertTo-SecureString
# -IgnoreCertificateError should not be used when proper certificate configuration is used
$pfa57 = New-PfaArray -EndPoint 10.21.201.57 -UserName pureuser -Password $57 -IgnoreCertificateError
# Demote the Pod
New-PfaCLICommand -EndPoint 10.21.201.57 -UserName pureuser -Password $57 -CommandText "purepod demote qpodprod --quiesce"
Promotion Script Example SDK 1.17
# The disk resources of any applications or cluster roles on the Pod should be brought online after the Pod is Promoted.
# Connect to DR FlashArray #130. For examples on how to encrypt the password to a text file, see ConvertTo-SecureString
$130 = Get-Content "C:\ps\130.txt" | ConvertTo-SecureString
# -IgnoreCertificateError should not be used when proper certificate configuration is used
$pfa130 = New-PfaArray -EndPoint 10.21.201.130 -UserName pureuser -Password $130 -IgnoreCertificateError
# Promote the Pod
New-PfaCLICommand -EndPoint 10.21.201.130 -UserName pureuser -Password $130 -CommandText "purepod promote qpoddr"
# Poll until the Pod no longer reports the "promoting" state
do {
    Write-Host "Waiting for Promotion"
    Start-Sleep -Milliseconds 500
    $test = New-PfaCLICommand -EndPoint 10.21.201.130 -UserName pureuser -Password $130 -CommandText "purepod list qpoddr"
} while ($test | Select-String -Pattern "promoting")
# Online physical disk housing the SQL DB dbprod-dr in the DR site
Set-Disk -Number 6 -IsOffline $false
Set-Disk -Number 6 -IsReadOnly $false
# Online DB dbprod-dr in the DR site
Invoke-Sqlcmd -InputFile c:\ps\repmount.sql -QueryTimeout 3600 -ServerInstance 'ch2-barkz-03\qstance'
#In the case of a Hyper-V Host, first online the Cluster Disk Resource, then any VMs on that CSV.
# Online CSV disk housing VM Exch02-dr in the DR site
Start-ClusterResource -Name "Cluster Disk 5"
# Online VM Exch02-dr in the DR site, requires Failover Cluster module
Start-ClusterResource -Name "Virtual Machine Exch02"
Demotion Script Example SDK 2.2
#The disk resources of any applications or cluster roles on the Pod should be brought offline after the application is stopped.
#Before demoting the Pod, offline VMs, SQL DBs, and then disks
#Offline DB dbprod-dr in the DR site
Invoke-Sqlcmd -inputfile c:\ps\repdismount.sql -QueryTimeout 3600 -ServerInstance 'ch2-barkz-03\qstance'
#Offline physical disk housing the SQL DB dbprod-dr in the DR site
Set-Disk -Number 6 -CimSession $cimsess -IsOffline $true
#Offline VM Exch02-dr in the DR site, requires Failover Cluster module
Stop-ClusterResource -name "Virtual Machine Exch02"
#Offline CSV disk housing VM Exch02-dr in the DR site
Stop-ClusterResource -name "Cluster Disk 5"
#Connect to the FlashArray with SDK 2.2. For OAUTH setup, see the Blog: Introducing the Pure FlashArray PowerShell SDK Version 2
#For examples on how to encrypt the password to a text file see ConvertTo-SecureString
#-IgnoreCertificateError should not be used when proper certificate configuration is used
$purepass = Get-Content "C:\ps\purepass.txt" | convertto-securestring
$pfa57 = Connect-Pfa2Array -EndPoint 10.21.201.57 -UserName pureuser -Password $purepass -IgnoreCertificateError
#demote the Pod
Update-Pfa2Pod -Array $pfa57 -Name qpodprod -RequestedPromotionState "demoted" -Quiesce $true
Promotion Script Example SDK 2.2
#The disk resources of any applications or cluster roles on the Pod should be brought online after the Pod is Promoted.
#Connect to the FlashArray with SDK 2.2. For OAUTH setup, see the Blog: Introducing the Pure FlashArray PowerShell SDK Version 2
#For examples on how to encrypt the password to a text file see ConvertTo-SecureString
#-IgnoreCertificateError should not be used when proper certificate configuration is used
$purepass = Get-Content "C:\ps\purepass.txt" | convertto-securestring
$pfa130 = Connect-Pfa2Array -EndPoint 10.21.201.130 -UserName pureuser -Password $purepass -IgnoreCertificateError
#Promote the Pod
Update-Pfa2Pod -Array $pfa130 -Name qdr2 -RequestedPromotionState "promoted"
#Get promotion status and stall until it is promoted.
#You do not want to online disks that have not been changed from READONLY status yet.
do {
    Write-Host "Waiting for Promotion"
    Start-Sleep -Milliseconds 500
    $test = (Get-Pfa2Pod -Array $pfa130 -Name qdr2).PromotionStatus
} while ($test | Select-String -Pattern "promoting")
Write-Host "Pod Promoted"
#Online physical disk housing the SQL DB dbprod-dr in the DR site
Set-Disk -Number 8 -CimSession $cimsess -IsOffline $false
Set-Disk -Number 8 -CimSession $cimsess -IsReadOnly $false
#Online DB dbprod-dr in the DR site
Invoke-Sqlcmd -inputfile c:\ps\repmount.sql -QueryTimeout 3600 -ServerInstance 'ch2-barkz-03\qstance'
#In the case of a Hyper-V Host, first online the Cluster Disk Resource, then any VMs on that CSV.
#Online CSV disk housing VM Exch02-dr in the DR site
#Start-ClusterResource -name "Cluster Disk 5"
#Online VM Exch02-dr in the DR site, requires Failover Cluster module
#Start-ClusterResource -name "Virtual Machine Exch02"
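The wait loops in the promotion scripts above poll without an upper bound. A variant with a timeout might look like the sketch below, so that an unattended script fails loudly instead of waiting forever. The function name Wait-PodPromotion and its parameters are hypothetical; the status check itself is passed in as a script block, so the same helper can wrap either the SDK 1.x or the SDK 2.x call.

```powershell
# Hypothetical helper: poll a status script block until it stops reporting "promoting".
function Wait-PodPromotion {
    param(
        [Parameter(Mandatory)] [scriptblock] $GetStatus,
        [int] $TimeoutSeconds = 300,
        [int] $PollMilliseconds = 500
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    do {
        # Multi-line CLI output is flattened to one string before matching.
        $status = [string](& $GetStatus)
        if ($status -notmatch 'promoting') { return $status }
        Start-Sleep -Milliseconds $PollMilliseconds
    } while ((Get-Date) -lt $deadline)
    throw "Pod still reports '$status' after $TimeoutSeconds seconds."
}

# Usage sketch with the SDK 2.x call from the script above:
# Wait-PodPromotion -GetStatus { (Get-Pfa2Pod -Array $pfa130 -Name qdr2).PromotionStatus }
#
# Usage sketch with the SDK 1.x call:
# Wait-PodPromotion -GetStatus {
#     New-PfaCLICommand -EndPoint 10.21.201.130 -UserName pureuser -Password $130 -CommandText "purepod list qpoddr"
# }
```

The default of 300 seconds is an arbitrary placeholder and should be tuned to the promotion times observed during failover tests.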
T-SQL Queries to Online & Offline a Database
ALTER DATABASE [dbprod-dr] SET OFFLINE WITH ROLLBACK IMMEDIATE
GO
The above code block shows the T-SQL query that takes the database offline. It is saved as repdismount.sql and referenced by the Invoke-Sqlcmd SQL Server PowerShell cmdlet. For information on how to safely offline a production database, see ALTER DATABASE SET options.
ALTER DATABASE [dbprod-dr] SET ONLINE
GO
The above code block shows the T-SQL query that brings the database online. It is saved as repmount.sql and referenced by the Invoke-Sqlcmd SQL Server PowerShell cmdlet.