vVols Deep Dive: Array Based Replication with vVols
vVols Replication Terminology
These terms are fundamental to how the APIs and integration with vVols replication will work.
Name/Concept | Explanation |
---|---|
Replication Provider | A VASA provider that supports VASA version 3 and array-based replication features. It informs VMware of the available replication features, configures VMs with replication, and reports compliance to VMware. |
Storage Capabilities | The array-based replication features offered by a replication provider. These are vendor specific and can include replication interval, consistency groups, concurrency, retention, and so on. |
Storage Policy | A collection of VASA capabilities, assembled by a user and assigned values. |
Fault Domain | An available target in the replication group. In other words, each fault domain is an array to which the VMs in that replication group can be failed over. Fault domain = array. |
Source Replication Group | A unit of failover for replicated vVol VMs. Individual VM failover is not possible (unless it is the only VM in the replication group). Replicated vVols are put into a source group. Every source group has a respective target group on each replication target (fault domain). The source replication group is associated with a FlashArray protection group on the source FlashArray. |
Target Replication Group | For every fault domain specified in a source replication group, there is a target replication group. Test failovers, failovers, and reprotects are executed against a target replication group. If there is a DR event, it is possible that only the target group is left; it is designed to withstand the failure of the source. The target replication group is associated with a target protection group on the target FlashArray. |
With these terms covered, here is a visual representation of what these terms correlate to. In the illustration below there are three FlashArrays, with FlashArray-A replicating to FlashArray-B and FlashArray-C.
[Illustration: three FlashArrays, with FlashArray-A replicating to FlashArray-B and FlashArray-C.]
VMware's vSphere user guide covers vVols replication groups and fault domains in some additional detail. Please refer to that user guide if additional context is desired.
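The relationships described in the terminology table can be sketched as a small data model. This is an illustrative Python sketch only: the class names, field names, and the group name `FlashArray-A:example-pgroup` are hypothetical and are not part of any VMware or Pure Storage API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class FaultDomain:
    """A replication target. In this article, fault domain = array."""
    name: str


@dataclass
class TargetReplicationGroup:
    """The group that failover operations are executed against."""
    fault_domain: FaultDomain
    source_group_name: str
    state: str = "TARGET"


@dataclass
class SourceReplicationGroup:
    """A unit of failover: all VMs in the group fail over together."""
    name: str
    vms: List[str] = field(default_factory=list)
    targets: Dict[str, TargetReplicationGroup] = field(default_factory=dict)

    def add_fault_domain(self, domain: FaultDomain) -> TargetReplicationGroup:
        # Each fault domain holds exactly one target group for this source group.
        target = self.targets.get(domain.name)
        if target is None:
            target = TargetReplicationGroup(domain, self.name)
            self.targets[domain.name] = target
        return target


# Hypothetical names mirroring the illustration: A replicates to B and C.
group = SourceReplicationGroup("FlashArray-A:example-pgroup", vms=["vm-1", "vm-2"])
for array in ("FlashArray-B", "FlashArray-C"):
    group.add_fault_domain(FaultDomain(array))
```

The point of the model is that a source group fans out to one target group per fault domain, and that the VM list belongs to the group rather than to any single VM-level failover unit.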
vVols Replication Operations
With the terminology foundation laid, it's time to dig into the specific APIs that drive vVols replication and management. For each API call, the operation, purpose, and use cases are covered.
API Call | Operation and Explanation |
---|---|
SyncReplicationGroup_Task() |
Operation: Synchronize Replication Group
Purpose: Tells a replication group to synchronize from its source group. You can specify a name to mark a point-in-time to reference later.
Use Case: Useful for creating quiesced or special failover points. You issue this against a target replication group, and it returns a task. When the replication to that fault domain is complete, the task reports completion.
What VASA Does: VASA on the target FlashArray triggers an on-demand snapshot to be replicated from the source FlashArray to the target FlashArray. This allows the command to be issued correctly against the target replication group. VASA adds a suffix to these snapshots: "VasaSyncRG" followed by six random letters/numbers. This is an async task, and VASA returns a task ID to vCenter for it. |
TestFailoverReplicationGroupStart_Task() |
Operation: Test Failover Start
Purpose: Initiates a target replication group to present writable vVols of the VMs in a replication group to the target side. This can be issued without a point-in-time (using the latest version) or with a specified point-in-time.
Use Case: Testing your recovery plans without affecting production. A test is run against a target replication group. This changes the target replication group from the TARGET state to the INTEST state.
What VASA Does: VASA on the target FlashArray starts by grabbing the most recent snapshot for the target replication group. VASA then copies out all of the volumes from the snapshot on the target FlashArray. The correct volume groups are created, and the config and data vVols are placed in the correct vgroups.
In addition to copying out the volumes from the snapshot, VASA creates a pgroup with the same replication schedule/settings as the source replication group if this is the first time a testFailover or Failover has been run for the replication group. If a testFailover or Failover has previously been run on the target replication group, VASA attempts to reuse the pgroup that was created in that process. If that pgroup was destroyed, or if additional unknown volumes were added to it, VASA treats the testFailover as the first one and creates a new pgroup.
If the testFailover is run after a successful Failover and re-protect, VASA first attempts to reuse the vgroups, volumes, and pgroup when copying out the volumes from the snapshot. If the pgroup has been destroyed or edited, VASA creates a new pgroup and does not reuse the existing vgroups and volumes.
After the volumes are copied out, the metadata associated with these vVol VMs is updated for the new storage container and VASA provider as needed. The files are then accessible from the target vVol storage container.
Once the job is complete, the updated .vmx file paths are returned. The VMs must be registered in a separate task; this API does not register them. |
[Screenshot: browsing the vVol datastore on the target vCenter/FlashArray after a testFailoverStart has completed.]
[Screenshot: browsing the vVol datastore on the source vCenter/array, showing the files and paths on the source for comparison with how they appear on the target.] |
TestFailoverReplicationGroupStop_Task() |
Operation: Test Failover Stop
Purpose: Ends a test failover and cleans up any volumes or VMs created on the target side.
Use Case: Ending a test failover and cleaning up the storage provisioned for it. A test stop is run against a target replication group that is in the INTEST state. This reverts it to the TARGET state.
What VASA Does: VASA destroys and eradicates the copied-out volumes and vgroups that were created for the test. Note that a volume cannot be destroyed while it is connected to a host; if any binds still exist, the stop task fails. Once all the volumes and vgroups have been destroyed and eradicated, the replication group is updated back to the TARGET state. |
[Screenshot: the vVol datastore on the target after the testFailoverReplicationGroupStop has completed. None of the VM files exist.]
Please note that the VMs should be powered off and unregistered before running this API. |
PromoteReplicationGroup_Task() |
Operation: Promote Replication Group
Purpose: In the case of a disaster during a test recovery, this allows you to specify VMs that are in the test state to become the production VMs.
Use Case: Loss of source site VMs during a test recovery. This is executed against a target replication group in the INTEST state and changes the state to FAILEDOVER.
What VASA Does: When testFailoverReplicationGroup is run against a target replication group, the group's state changes from TARGET to INTEST. Running PromoteReplicationGroup on an INTEST replication group updates its state to FAILEDOVER, which then allows ReverseReplicateGroup to be issued to update it to SOURCE. |
PrepareFailoverReplicationGroup_Task() |
Operation: Prepare Failover
Purpose: Synchronizes the replication group to a fault domain. Afterward, the target replication group no longer accepts syncReplicationGroup operations.
Use Case: Doing a final synchronization before a failover. This is issued against the source replication group, so it is really only useful for planned migrations. It is recommended only for actual failovers, not for test failovers. |
FailoverReplicationGroup_Task() |
Operation: Failover
Purpose: Runs a migration or a disaster recovery failover of the VMs in a replication group.
Use Case: Disruptively moving VMs in a replication group from one array to another for a DR event or a planned migration. This is run against a target replication group and changes the state from TARGET to FAILEDOVER.
What VASA Does: This process is similar to the testFailover, in that the volumes from the most recent snapshot (or the specified point-in-time) are copied out and updated on the target FlashArray (fault domain). The difference is that the target replication group's state is updated to FAILEDOVER rather than INTEST, which means that a ReverseReplicateGroup can be issued once the Failover task has completed.
In addition to copying out the volumes from the snapshot, VASA creates a pgroup with the same replication schedule/settings as the source replication group if no testFailover has been run and this is the first Failover. If a testFailover or Failover has previously been run on the target replication group, VASA attempts to reuse the pgroup that was created in that process. If that pgroup was destroyed, or if additional unknown volumes were added to it, VASA treats the Failover as the first one and creates a new pgroup.
VASA does not destroy or eradicate the source volumes, vgroup, and pgroup for the VMs that are failed over as part of this replication group. If a testFailover is run, those volumes, vgroups, and pgroup are reused and then destroyed when the testFailover is cleaned up. If a Failover is run before a testFailover, VASA attempts to reuse the vgroups, volumes, and pgroup when failing over to the target.
Please note that the API does not power off or unregister the VMs at the source vCenter/FlashArray, nor does it register the recovered VMs in the recovery vCenter Server. All of this must be done by the end user. |
ReverseReplicateGroup_Task() |
Operation: Reprotect
Purpose: Makes a failed-over group a source that replicates back to the original source.
Use Case: Ensures that your VMs are protected back to the original site. This is run against a FAILEDOVER replication group and changes its state to SOURCE. It is not strictly required, because you can also apply a new storage policy to the VMs to protect them; it is only needed to reset the state of the original target group.
What VASA Does: VASA initiates a snapshot replication from the pgroup that the copied-out volumes were added to. Once this snapshot has completed, the replication schedule is enabled and the replication group's state is updated from FAILEDOVER to SOURCE. ReverseReplicateGroup does not re-apply storage policies or assign replication groups in SMS/vCenter. To complete the re-protect process, the end user needs to reset the storage policy to VVol No Requirements Policy and then apply the storage policy on the newly protected site along with the correct replication group. |
QueryReplicationGroup() QueryPointInTimeReplica() QueryReplicationPeer() |
Operation: Queries
Purpose: Retrieve the state of the replication environment.
Use Case: Used to script or detect the state of replication, the available point-in-times, and the status of a group. These can be run against most types of groups.
What VASA Does: For each query, VASA checks the metadata and tags for the associated request and then returns the results. |
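The pgroup handling described under "What VASA Does" for the test failover and failover operations reduces to a small decision. The following is a minimal Python sketch of that rule as this article states it, not VASA's actual implementation; `PreviousPgroup`, its fields, and `plan_pgroup` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PreviousPgroup:
    """State of the pgroup left behind by an earlier testFailover/Failover."""
    destroyed: bool = False
    has_unknown_volumes: bool = False


def plan_pgroup(previous: Optional[PreviousPgroup]) -> str:
    """Return "create" or "reuse" following the rules described above."""
    if previous is None:
        # First testFailover or Failover for this target replication group.
        return "create"
    if previous.destroyed or previous.has_unknown_volumes:
        # The old pgroup cannot be trusted, so treat this run as the first one.
        return "create"
    return "reuse"


print(plan_pgroup(None))                                      # create
print(plan_pgroup(PreviousPgroup()))                          # reuse
print(plan_pgroup(PreviousPgroup(has_unknown_volumes=True)))  # create
```

Per the description above, the "create" outcome after a failover and re-protect also means the existing vgroups and volumes are not reused; the sketch only models the pgroup decision.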
Please pay close attention to the notice below:
Regarding Management Path changes:
API calls for FailoverReplicationGroup and TestFailoverReplicationGroup do not register VMs, power them on, or change networks. These steps must still be performed separately.
The vVols replication management APIs only make sure the VM storage is ready on the target site.
Once the VMs appear on the target storage, they can be registered and configured as needed.
Each of these APIs can be called through the vCenter MOB. However, that is not the best way to manage a vVols ecosystem. vRealize Orchestrator, PowerCLI, and Site Recovery Manager (8.3+) all integrate with these APIs to support vVols array replication workflows.
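The state changes that these API calls drive on a replication group can be summarized as a small state machine. This is an illustrative Python sketch built only from the transitions described in the table above. `GroupState`, `TRANSITIONS`, `apply`, and `run` are hypothetical names, and operations that do not change state (synchronize, prepare failover, and the queries) are left out.

```python
from enum import Enum


class GroupState(Enum):
    SOURCE = "SOURCE"
    TARGET = "TARGET"
    INTEST = "INTEST"
    FAILEDOVER = "FAILEDOVER"


# (operation, state before) -> state after, as described in the API table.
TRANSITIONS = {
    ("TestFailoverReplicationGroupStart", GroupState.TARGET): GroupState.INTEST,
    ("TestFailoverReplicationGroupStop", GroupState.INTEST): GroupState.TARGET,
    ("PromoteReplicationGroup", GroupState.INTEST): GroupState.FAILEDOVER,
    ("FailoverReplicationGroup", GroupState.TARGET): GroupState.FAILEDOVER,
    ("ReverseReplicateGroup", GroupState.FAILEDOVER): GroupState.SOURCE,
}


def apply(operation: str, state: GroupState) -> GroupState:
    """Return the next state, or raise if the operation is not valid here."""
    try:
        return TRANSITIONS[(operation, state)]
    except KeyError:
        raise ValueError(f"{operation} is not valid in state {state.name}") from None


def run(operations, start=GroupState.TARGET) -> GroupState:
    """Apply a sequence of operations to a group starting in `start`."""
    state = start
    for operation in operations:
        state = apply(operation, state)
    return state


# A failover followed by a reprotect leaves the group as a source.
print(run(["FailoverReplicationGroup", "ReverseReplicateGroup"]).name)  # SOURCE
```

A table-driven model like this makes it easy to check a scripted workflow before running it, for example to confirm that a reverse is only attempted on a group that has reached FAILEDOVER.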
vVols Replication PowerCLI Commands
Here are the PowerCLI commands that relate to managing vVols array-based replication with storage policies. Each command has a brief explanation of what it does. For further information, run Get-Help followed by the name of the command.
Name/Concept | Explanation |
---|---|
Get-SpbmFaultDomain | Retrieves fault domains based on a name or ID filter. Prints the name of each fault domain and the VASA provider managing it. |

    > Get-SpbmFaultDomain

    Name                VasaProvider
    ----                ------------
    sn1-m20r2-c05-36    sn1-m20r2-c05-36-ct0
    sn1-x70-b05-33      sn1-x70-b05-33-ct0
    sn1-x70-c05-33      sn1-x70-c05-33-ct0

Get-SpbmReplicationGroup | Retrieves the replication groups queried from the VASA providers. Prints the name and state of each replication group. |

    > Get-SpbmReplicationGroup

    Name                                       ReplicationState
    ----                                       ----------------
    sn1-x70-b05-33:vVols-Replication           Source
    sn1-x70-b05-33:x70-1-policy-ac1-light-001  Source

Get-SpbmReplicationPair | Retrieves the pairing between source and target replication groups. |

    > Get-SpbmReplicationPair

    Source Group                               Target Group
    ------------                               ------------
    sn1-x70-b05-33:vVols-Replication           395a60c2-5803-40be-95b7-029b1b3ffc3e:62
    sn1-x70-c05-33:x70-2-policy-ac2-light-001  35770c78-edaf-4afc-9b75-f3fb5c2acee9:9

Get-SpbmPointInTimeReplica | Retrieves the point-in-time replicas (array-based snapshots) for a provided replication group. Scheduled pgroup snapshots will not have a name or description. |

    > Get-SpbmPointInTimeReplica

    Name     CreationTime            ReplicationGroup
    ----     ------------            ----------------
             8/25/2020 3:34:25 PM    395a60c2-5803-40be-95b7-029b1b3ffc3e:62
    PiT-1    8/25/2020 3:33:48 PM    395a60c2-5803-40be-95b7-029b1b3ffc3e:62

Get-SpbmStoragePolicy | Retrieves the storage policies from the connected vCenter Servers. |

    > Get-SpbmStoragePolicy

    Name                            Description                                                                        Rule Sets
    ----                            -----------                                                                        ---------
    Pure-Demo                                                                                                          {(com.purestorage.storage.policy.PureFlashArray=True) AND (com.purestorage.storage.replication.RemoteReplicationInterval=00:05:00…
    VVol No Requirements Policy     Allow the datastore to determine the best placement strategy for storage objects
    FlashArray Snap 1 DAYS          FlashArray Storage Policy. Snapshot every 1 Days, retained for 7 Days.             {(com.purestorage.storage.policy.PureFlashArray=True) AND (com.purestorage.storage.replication.LocalSnapshotInterval=1.00:00:00) …
    FlashArray Replication 8 HOURS  FlashArray Storage Policy. Remote Replication every 8 Hours, retained for 1 Days.  {(com.purestorage.storage.policy.PureFlashArray=True) AND (com.purestorage.storage.replication.RemoteReplicationInterval=08:00:00…

Sync-SpbmReplicationGroup | Triggers an on-demand snapshot replication job. This is run against the target replication group and initiated from the target FlashArray. |

    > Sync-SpbmReplicationGroup -ReplicationGroup '395a60c2-5803-40be-95b7-029b1b3ffc3e:62' -PointInTimeReplicaName 'PiT-3'
    Sync-SpbmReplicationGroup: 8/25/2020 3:48:04 PM Sync-SpbmReplicationGroup Error doing 'Sync' on replication group '30488813-7524-3538-868d-66c8037a6d39/395a60c2-5803-40be-95b7-029b1b3ffc3e:62'. Reason: Error 1: Sync of the replication group is ongoing. Ongoing task ID: 'SmsTask-SmsTask-90'

    > Sync-SpbmReplicationGroup -ReplicationGroup '395a60c2-5803-40be-95b7-029b1b3ffc3e:62' -PointInTimeReplicaName 'PiT-4'
    Sync-SpbmReplicationGroup: 8/25/2020 3:56:26 PM Sync-SpbmReplicationGroup Error doing 'Sync' on replication group '30488813-7524-3538-868d-66c8037a6d39/395a60c2-5803-40be-95b7-029b1b3ffc3e:62'. Reason: Error 1: Sync of the replication group is ongoing. Ongoing task ID: 'SmsTask-SmsTask-92'

This type of error is an expected outcome for a syncReplicationGroup task. The key is that the "error" reports an "ongoing task", which means that a replication job was started and is now in progress. syncReplicationGroup is an async task, and the Pure VASA provider returns a task ID for it. The PowerShell cmdlet for syncReplicationGroup does not process task IDs or query the VASA provider for task progress.

Start-SpbmReplicationTestFailover | Performs a test failover against the target replication group. Upon completion, the replication group is in the INTEST state. |

    > Start-SpbmReplicationTestFailover -ReplicationGroup '395a60c2-5803-40be-95b7-029b1b3ffc3e:62'
    [FlashArray-B-vVol-DS] rfc4122.918928d8-01aa-47f1-80cd-f31e66d5eac7/vVols-Rep-VM-1.vmx
    [FlashArray-B-vVol-DS] rfc4122.fa025596-332f-4e39-82e8-8055f7b589fb/vVols-Rep-VM-2.vmx
    [FlashArray-B-vVol-DS] rfc4122.f24fc678-26a4-4234-9356-3b712abbc20b/vVols-Rep-VM-3.vmx

    > Get-SpbmReplicationGroup -ReplicationGroup '395a60c2-5803-40be-95b7-029b1b3ffc3e:62'

    Name                                       ReplicationState
    ----                                       ----------------
    395a60c2-5803-40be-95b7-029b1b3ffc3e:62    InTest

Start-SpbmReplicationPromote | Promotes a target replication group from InTest to FailedOver state. |
Stop-SpbmReplicationTestFailover | Stops the test failover on the specified replication groups and performs a cleanup on the target site. |

    > Stop-SpbmReplicationTestFailover -ReplicationGroup '395a60c2-5803-40be-95b7-029b1b3ffc3e:62'

    Name                        ReplicationState
    ----                        ----------------
    395a60c2-5803-40be-95b7-…   Target

Start-SpbmReplicationPrepareFailover | Prepares the specified replication groups to fail over. This is run against the source replication group. |

    > Start-SpbmReplicationPrepareFailover -ReplicationGroup 'sn1-x70-b05-33:vVols-Replication'

Start-SpbmReplicationFailover | Performs a failover of the devices in the specified replication groups. |

    > Start-SpbmReplicationFailover -ReplicationGroup '395a60c2-5803-40be-95b7-029b1b3ffc3e:62'

    Confirm
    Are you sure you want to perform this action?
    Performing the operation "Starting failover on" on target "Replication group '395a60c2-5803-40be-95b7-029b1b3ffc3e:62'".
    [Y] Yes [A] Yes to All [N] No [L] No to All [S] Suspend [?] Help (default is "Y"): y
    [FlashArray-B-vVol-DS] rfc4122.f44d3e0f-f25d-4107-bbd6-9a8c2940720b/vVols-Rep-VM-1.vmx
    [FlashArray-B-vVol-DS] rfc4122.6a3f3e3c-755d-4c00-a63c-1fd4a69b1476/vVols-Rep-VM-2.vmx
    [FlashArray-B-vVol-DS] rfc4122.995eff3c-630c-4ea1-bf33-2eb0f06de84d/vVols-Rep-VM-3.vmx

    > Get-SpbmReplicationGroup -ReplicationGroup '395a60c2-5803-40be-95b7-029b1b3ffc3e:62'

    Name                                       ReplicationState
    ----                                       ----------------
    395a60c2-5803-40be-95b7-029b1b3ffc3e:62    FailedOver

Start-SpbmReplicationReverse | Initiates reverse replication. This reverses the roles of the replication groups: source becomes target and target becomes source. |

    > Start-SpbmReplicationReverse -ReplicationGroup '395a60c2-5803-40be-95b7-029b1b3ffc3e:62'

    Name                        ReplicationState
    ----                        ----------------
    sn1-x70-c05-33:r-vVols-R…   Source

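Because Sync-SpbmReplicationGroup reports an in-progress replication as an error, a script that wraps it needs to tell that expected message apart from a real failure. The following is a minimal Python sketch of that check, assuming the message text keeps the wording shown in the transcript above (it may differ between versions); `ongoing_sync_task_id` is a hypothetical helper, not part of PowerCLI.

```python
import re
from typing import Optional

# Wording taken from the Sync-SpbmReplicationGroup transcript in this article.
ONGOING_SYNC = re.compile(
    r"Sync of the replication group is ongoing\. Ongoing task ID: '([^']+)'"
)


def ongoing_sync_task_id(message: str) -> Optional[str]:
    """Return the task ID if the message is the expected "ongoing" notice."""
    match = ONGOING_SYNC.search(message)
    return match.group(1) if match else None


sample = (
    "Error doing 'Sync' on replication group "
    "'30488813-7524-3538-868d-66c8037a6d39/395a60c2-5803-40be-95b7-029b1b3ffc3e:62'. "
    "Reason: Error 1: Sync of the replication group is ongoing. "
    "Ongoing task ID: 'SmsTask-SmsTask-90'"
)

print(ongoing_sync_task_id(sample))           # SmsTask-SmsTask-90
print(ongoing_sync_task_id("other failure"))  # None
```

A `None` result means the message is something other than the expected notice and should be treated as a genuine error.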
Now that the available commands have been covered, let's look at what a failover workflow looks like.
Overall, the workflow is straightforward, but fully re-protecting the VMs after the reverse requires some extra steps that are easy to miss or skip accidentally.
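One way to avoid skipping the re-protect steps is to track the workflow as an explicit checklist. The following Python sketch is an assumption-laden illustration: the step names are hypothetical, and the ordering is one plausible reading of this article for a planned migration, not an official runbook.

```python
from typing import Iterable, Optional

# Hypothetical step names. Steps marked "user" are not performed by the
# replication APIs and must be done separately, as noted earlier.
PLANNED_FAILOVER_STEPS = [
    "power off and unregister VMs at the source",            # user
    "PrepareFailoverReplicationGroup on the source group",
    "FailoverReplicationGroup on the target group",
    "register and power on VMs at the recovery site",        # user
    "ReverseReplicateGroup on the failed-over group",
    "reset VM storage policy to VVol No Requirements Policy",  # user
    "apply replication storage policy and replication group",  # user
]


def first_missing_step(completed: Iterable[str]) -> Optional[str]:
    """Return the earliest step that has not been completed, if any."""
    done = set(completed)
    for step in PLANNED_FAILOVER_STEPS:
        if step not in done:
            return step
    return None


# Stopping right after the reverse leaves the VMs without a reapplied policy.
print(first_missing_step(PLANNED_FAILOVER_STEPS[:5]))
# reset VM storage policy to VVol No Requirements Policy
```

The last two entries are the ones that are easy to miss: the reverse operation changes the group state but does not reapply storage policies or replication group assignments.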
vVols Replication with the Pure Storage vSphere Client Plugin
With version 5.2.0 of the vSphere Client Plugin we have released the replication manager feature. The replication manager allows users to execute replication workflows for vVols based VMs that are using SPBM and replication groups. Please see the KB for more information on the exact process and workflows.
vVols Replication with vRealize Orchestrator
The Pure Storage vRO plugin contains workflows for vVols replication such as a testFailover and Failover for FlashArray replication groups. Additionally, there are workflows to assign storage policies and replication groups to VMs.
Pure is currently revamping the vRO documentation, and there is not yet a KB for running vVols-based workflows. This section will be updated once the KB that covers managing vVols replication with vRO is finished.
Please keep an eye on the KBs for vRO for more information.
vVols Replication with Site Recovery Manager
Full support for SRM with vVols FlashArray replication is generally available with the release of Pure Storage's VASA 1.1.0 (available with Purity 5.3.6+) and VMware's Site Recovery Manager 8.3.
This integration is certified with VMware and Pure Storage.
Please refer to the Pure SRM user guide for further information.