A vVols best practices summary.
If using Virtual Volumes and FlashArray replication, ensure that the anticipated recovery site is running vSphere 6.5 or later.
If using vVols array based replication for failover and recovery methods, Pure Storage strongly recommends running at minimum Purity//FA 5.3.6.
As always, please ensure you follow standard Pure Storage best practices for vSphere.
vVols Best Practices Quick Guidance Points
Here are some quick points of guidance when using vVols with the Pure Storage FlashArray. These are not meant to be Best Practices deep dives nor a comprehensive outline of all best practices when using vVols with Pure Storage; a Best Practices deep dive will be given in the future. However, more explanation about the requirements and recommendations is given in the summary above.
While vVols support was first introduced with Purity 5.0.0, there have been significant fixes and enhancements to the VASA provider in later releases of Purity. Because of this, Pure has set the required Purity version for vVols to a later release.
- For general vVols use, Purity 5.1.15+ or Purity 5.3.6+ is required.
- Purity 5.3.6+ is required for vVols support with Site Recovery Manager (SRM) Array Based Replication protection and recovery.
Pure Storage strongly recommends that customers running vVols upgrade to Purity//FA 5.3.12 or higher.
The main reason is that these releases include VASA enhancements that improve vVols support at higher scale, the performance of Managed Snapshots, and the SPBM Replication Group Failover API at scale.
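The version thresholds above can be expressed as a small check. This is only a sketch: the function names are hypothetical, version strings are assumed to be plain `major.minor.patch`, and "5.1.15+ or 5.3.6+" is read here as "5.1.15 or later on the 5.1 line, otherwise 5.3.6 or later", which is an interpretation rather than something stated explicitly.

```python
from typing import Tuple


def parse_purity_version(version: str) -> Tuple[int, int, int]:
    """Parse a 'major.minor.patch' string such as '5.3.12' into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.strip().split("."))
    return major, minor, patch


def meets_general_vvols_requirement(version: str) -> bool:
    """Purity 5.1.15+ (on the 5.1 line) or 5.3.6+ for general vVols use."""
    v = parse_purity_version(version)
    if v[:2] == (5, 1):
        # Assumption: on the 5.1 line, only 5.1.15 and later qualify.
        return v >= (5, 1, 15)
    # Assumption: any other line (including 5.2.x) must be at 5.3.6 or later.
    return v >= (5, 3, 6)


def meets_srm_requirement(version: str) -> bool:
    """Purity 5.3.6+ for SRM array based replication with vVols."""
    return parse_purity_version(version) >= (5, 3, 6)


def meets_recommendation(version: str) -> bool:
    """Purity//FA 5.3.12 or higher is the recommended level."""
    return parse_purity_version(version) >= (5, 3, 12)
```

For example, `meets_general_vvols_requirement("5.1.15")` is true while `meets_srm_requirement("5.1.15")` is false, which mirrors the two bullets above.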
While vSphere Virtual Volumes 2.0 was released with vSphere 6.0, the Pure Storage FlashArray only supports vSphere Virtual Volumes 3.0, which was released with vSphere 6.5. As such, the minimum required vSphere version is 6.5 GA release. However, there are significant fixes specific to vVols so the required versions and recommended versions are as follows:
- Requirement: vSphere Version is 6.5 U3 or higher
- Recommended: vSphere Version is 6.7 U3 P03+ or vSphere Version 7.0 U1+
With the release of vSphere 6.7 U3 P03, VMware fixed a few major issues that customers had seen when migrating workloads to vVols. Pure Storage has these fixes tracked in a KB that outlines any VASA, vVols or Storage provider fixes per ESXi release. Please refer to this KB and VMware's vSphere release notes when planning your vSphere environment's version recommendations.
With regard to the vSphere environment, there are some networking requirements and some strong recommendations from Pure Storage when implementing vVols in your vSphere Environment.
- Requirement: NTP must be configured the same across all ESXi hosts and vCenter Servers in the environment. The time and date must be set to the current date/time.
- Recommended: Configure Syslog forwarding for vSphere environment.
- Requirement: Network port 8084 must be open and accessible from vCenter Servers and ESXi hosts to the FlashArray that will be used for vVols.
- Recommended: Use Virtual Machine Hardware version 11 or higher.
- The Best Practice is to use the hardware version recommended for the vSphere version your environment is running, as long as it is 11 or higher.
- Requirement: Do not run vCenter servers on vVols.
- While a vCenter server can run on vVols, in the event of any failure on the VASA Management Path combined with a vCenter server restart, the environment could enter a state where vCenter Server may not be able to boot or start. Please see the failure scenario KB for more detail on this.
- Recommended: Either configure an SPBM policy to snapshot all of the vVol VMs' Config vVols or manually put Config vVols in a FlashArray protection group with a snapshot schedule enabled.
- A snapshot of the Config vVol is required for the vSphere Plugin's VM undelete feature. Having a backup of the Config vVol also helps the recovery or rollback process for the VM in the event that there is an issue. A detailed KB outlines some of these workflows.
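The port 8084 requirement in the list above can be spot-checked with a plain TCP connection attempt. The sketch below is an illustration only: `is_port_reachable` is a hypothetical helper, and it tests reachability from wherever the script runs, so a result from an admin workstation only approximates what vCenter Servers and ESXi hosts will see.

```python
import socket


def is_port_reachable(host: str, port: int = 8084, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (hypothetical controller address, not taken from this article):
# print(is_port_reachable("flasharray-ct0.example.com"))
```

A successful connection only shows that the port is open on the network path; it does not confirm that the VASA Service is healthy.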
Here is some more detail and color for the requirements and recommendations with the FlashArray:
- Requirement: The FlashArray Protocol Endpoint object 'pure-protocol-endpoint' must exist. The FlashArray admin must not rename, delete or otherwise edit the default FlashArray Protocol Endpoint.
- Currently, Pure Storage stores important information for the VASA Service with the pure-protocol-endpoint namespace. Destroying or renaming this object will cause VASA to be unable to forward requests to the database service in the FlashArray. This effectively makes the VASA Provider unable to process requests and causes the Management Path to fail. Pure Storage is working to correct this and improve this implementation in a future Purity release.
- Recommendation: Create a local array admin user when running Purity 5.1 and higher. This user should then be used when registering the storage providers in vCenter.
- Recommendation: Following vSphere Best Practices with the FlashArray, ESXi clusters should map to FlashArray host groups and ESXi hosts should map to FlashArray hosts.
- Recommendation: The protocol endpoint should be connected to host groups on the FlashArray and not to individual hosts.
- Recommendation: While multiple protocol endpoints can be created manually, the default device queue depth for protocol endpoints is 128 in ESXi; this means adding additional protocol endpoints is often unnecessary.
VASA Provider/Storage Provider
The FlashArray has a storage provider running on each FlashArray controller called the VASA Service. The VASA Service is part of the core Purity Service, meaning that it automatically starts when Purity is running on that controller. In vSphere, the VASA Providers will be registered as Storage Providers. While Storage Providers/VASA Providers can manage multiple Storage Arrays, the Pure VASA Provider will only manage the Array that it is running on. Even though the VASA Service is running and active on both controllers, vCenter will only use one VASA Provider as the active Storage Provider and the other VASA Provider will be the Standby Provider.
Here are some requirements and recommendations when working with the FlashArray VASA Provider.
- Requirement: Register both VASA Providers, CT0 and CT1, respectively.
- While it's possible to only register a single VASA provider, this leaves a single point of failure in your management path.
- Recommendation: Do not use an Active Directory user to register the storage providers.
- Should the AD service/server be running on vVols, Pure Storage strongly recommends not using an AD user to register the storage providers. Doing so leaves a single point of failure on the management path in the event that the AD user has its permissions changed, its password changed, or the account is deleted.
- Recommendation: Use a local array admin user to register the storage providers.
- Recommendation: Should the FlashArray be running Purity 5.3.6 or higher, import CA-signed certificates to VASA-CT0 and VASA-CT1.
Managed Snapshots for vVols based VMs
One of the major benefits of using vVols is the integration between storage and vSphere Managed Snapshots. The operations of the managed snapshot are offloaded to the FlashArray and there is no performance penalty for keeping the managed snapshots. Because these operations are offloaded to VASA and the FlashArray, they create additional work on the FlashArray that does not exist with managed snapshots on VMFS VMs. Here are some points to keep in mind when using Managed Snapshots with vVols based VMs.
- Managed Snapshots for vVols based VMs create volumes for each Data vVol on that VM that have a -snap suffix in their naming.
- The process of taking a managed snapshot for a vVol based VM will first issue a Prepare Snapshot Virtual Volume operation which will cause VASA to take a consistent Point in Time volume snapshot of all Data vVols. Then the Snapshot Virtual Volume requests will follow, which will cause the snapshots to be copied out to volumes on the FlashArray for that VM.
- With FA volumes being created for the managed snapshot, this directly impacts the volume count on the FlashArray. For example, a vVol VM with 5 VMDKs (Data vVols) will create 5 new volumes on the FA for each managed snapshot. If 3 managed snapshots are taken, this VM accounts for 21 volumes on the FA while powered off (1 Config vVol and 20 Data vVols: the 5 original Data vVols plus 15 snapshot volumes), and 22 while powered on (1 additional Swap vVol).
- Managed Snapshots only trigger Point in Time snapshots of the Data vVols and not the Config vVol. In the event that the VM is deleted and a recovery of the VM is desired, the recovery must be done manually from a pgroup snapshot.
- The process of VMware taking a managed snapshot is fairly serialized; specifically, the snapshotVirtualVolume operations are serialized. This means that if a VM has 3 VMDKs (Data vVols), the snapshotVirtualVolume request will be issued for one VMDK, and only after it completes will the operation be issued against the next VMDK. The more VMDKs a VM has, the longer the managed snapshot will take to complete. This could increase the stun time for that VM.
- Recommendation: Plan accordingly when setting up managed snapshots (scheduled or manual) and configuring backup software which leverages managed snapshots for incremental backups. The size of the Data vVols and the number of Data vVols per VM can impact how long the snapshot virtual volume op takes and how long the stun time can be for the VM.
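The volume count arithmetic from the example above can be written out as a short helper for capacity planning. This is a sketch with a hypothetical function name; it counts only the volume types named in this section (Config, Data, managed snapshot volumes, and Swap) and assumes every managed snapshot covers every Data vVol of the VM.

```python
def vvol_volume_count(data_vvols: int, managed_snapshots: int,
                      powered_on: bool = False) -> int:
    """Estimate how many FlashArray volumes a single vVol based VM accounts for."""
    config_vvols = 1                                  # one Config vVol per VM
    snapshot_volumes = data_vvols * managed_snapshots  # one volume per Data vVol per snapshot
    swap_vvols = 1 if powered_on else 0               # Swap vVol exists only while powered on
    return config_vvols + data_vvols + snapshot_volumes + swap_vvols


# The example from the text: 5 VMDKs and 3 managed snapshots.
print(vvol_volume_count(5, 3))                   # 21 while powered off
print(vvol_volume_count(5, 3, powered_on=True))  # 22 while powered on
```

Multiplying this per-VM figure across the environment gives a rough view of how quickly managed snapshots consume the array's volume count.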
Storage Policy Based Management (SPBM)
There are a few aspects of utilizing Storage Policies with vVols and the FlashArray to keep in mind when managing your vSphere Environment.
- Storage Policies can have one or multiple replication groups (FlashArray protection groups).
- SPBM Failover workflow APIs are run against the replication group and not the storage policy itself.
- Recommendation: Attempt to keep replication groups under 100 VMs. This will assist with the VASA Ops being issued against the policies and replication groups and the time it takes to return these queries.
- This includes both Snapshot and Replication enabled protection groups. These VASA Ops, such as queryReplicationGroup, will look up all objects in both local replication and snapshot pgroups, as well as target protection groups. The more protection groups there are, and the more objects they contain, the longer these queries take. Please see vVols Deep Dive: Lifecycle of a VASA Operation for more information.
- Recommendation: Do not change the default storage policy with the vVols Datastore. This could cause issues in the vSphere UI when provisioning to the vVols Datastore.
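The guidance to keep replication groups under 100 VMs can be audited with a simple filter. The sketch below assumes the VM count per protection group has already been collected into a dictionary by some other means; the function name and the sample group names are hypothetical.

```python
from typing import Dict, List

MAX_VMS_PER_REPLICATION_GROUP = 100  # "under 100 VMs" per the recommendation above


def oversized_replication_groups(vm_counts: Dict[str, int],
                                 limit: int = MAX_VMS_PER_REPLICATION_GROUP) -> List[str]:
    """Return the names of groups whose VM count is not under the limit, sorted by name."""
    return sorted(name for name, count in vm_counts.items() if count >= limit)


# Hypothetical inventory: protection group name -> number of VMs assigned to it.
inventory = {"pg-prod-a": 42, "pg-prod-b": 100, "pg-dev": 180}
print(oversized_replication_groups(inventory))  # ['pg-dev', 'pg-prod-b']
```

Groups flagged this way are candidates for splitting into several smaller replication groups.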