Considerations for Deploying Hyper-V Hosts Using FlashArray
Introduction
Every facet of Hyper-V and its underlying platform, including, but not limited to, the hypervisor host operating system, the physical server running the hypervisor, the network and interconnects, and the storage components, must be chosen and configured to ensure both the best possible performance and the highest availability achievable with the given budget and documented requirements. This section will discuss how to properly configure Hyper-V for in-guest workloads as well as FlashArray.
An example of a fully deployed Hyper-V solution can be seen in the article “Hyper-V PoC Automation Guide” in Pure’s Microsoft Platform Guide.
General Considerations for Hyper-V Hosts
This section will discuss general considerations for servers that will be Hyper-V hosts.
To see the maximum configuration limitations for Hyper-V hosts and VMs, consult the Microsoft Learn article “Plan for Hyper-V scalability in Windows Server”.
Version of Windows Server
Per the Microsoft Support Matrix in the Microsoft Platform Guide, Windows Server versions are fully supported with Pure Integrations up to the Extended End Date listed in Microsoft’s lifecycle documentation. Implement versions of Windows Server for Hyper-V hosts that ensure supportability by both Microsoft and Pure for the production lifecycle of the systems.
Processor-Related Power Settings for Hosts
To ensure maximum performance of each server functioning as a Hyper-V host, the server should have all resources available to it. The default power policies or settings at both the hardware and OS layers usually allow less-utilized cores to be automatically slowed or halted to save power. Leaving these settings at their defaults can have a dramatic effect on every VM running on that host, on network performance, and more. Performance matters.
BIOS/UEFI Power Settings for Processors
Hardware vendors have settings in BIOS/UEFI to regulate how much power each processor core can use. By default, the settings usually allow the server to reduce core speed to save power. The setting(s) that control this behavior vary from manufacturer to manufacturer. For example, this setting is sometimes referred to as a C-State, while others may call it something different. Note that sometimes to ensure full power, multiple related settings may need to be configured. Vendors may also have recommended profiles for virtualization.
Refer to the hardware vendor’s documentation for the server’s particular CPU configuration to determine the appropriate CPU power profile configuration.
Windows Server Power Plan
By default, Windows Server’s power plan is set to Balanced. Set it to High performance on Hyper-V hosts, as shown in Figure 1. If this setting cannot be changed in Windows Server, check Group Policy to ensure that it is not being set somewhere that is not obvious and more challenging to alter.
Figure 1. Power plan in Windows Server
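The active plan can also be checked and set from an elevated command prompt using the built-in powercfg utility (a minimal sketch; SCHEME_MIN is the alias for the High performance plan):
powercfg /list
powercfg /setactive SCHEME_MIN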
Security Software
While the use of antivirus, anti-malware, or other security software is common in most environments, specific exclusions should be implemented to ensure that these security tools do not interfere with the Hyper-V host. Microsoft publishes a list of known files, folders, and extensions to exclude from scanning for your security software. You can find that list in the Microsoft Learn article “Recommended antivirus exclusions for Hyper-V hosts”.
Host Availability
Businesses today require more uptime than ever from their systems. The goal is to provide business continuity which also encompasses local high availability as well as disaster recovery. Business continuity targets must be agreed upon and documented as they guide the solutions that are deployed. Examples include overall service level agreements (SLAs), recovery time objectives (RTOs), and recovery point objectives (RPOs).
There are two kinds of downtime: planned and unplanned. Unplanned downtime is exactly what it sounds like: unexpected events such as operating system issues, application-level errors, or hardware failures can spring up at any time with little to no warning, causing an outage. No matter how brief, downtime still impacts the business. Planned downtime often occurs due to routine operations such as physical server, operating system, or other periodic maintenance tasks that require some level of outage. An example would be patching or updating a Windows Server host that would then need a reboot to complete the task.
Everything must be accounted for, including the underlying infrastructure, the server and the hardware within it, Windows Server and Hyper-V, as well as the VMs and what is deployed in them. Reducing the impact of downtime at each of these layers is the goal.
The key to availability is redundancy. This means that the servers running Hyper-V must be resilient to failure. Core components such as power should not have single points of failure. Specifics around storage and networking will be discussed in their upcoming sections.
Examples include having more than one physical network adapter (NIC or pNIC) and/or Host Bus Adapter (HBA). NICs and HBAs often have multiple ports on a single card, which are technically redundant, but if the card or backplane interface itself fails, connectivity to networking or storage is lost for all of those ports. Consider options such as having multiple pNICs or HBAs on different hardware backplane connections so that adapter active multipathing will absorb a failure and either prevent or minimize an outage (assuming proper configuration).
At the Windows Server layer, should one hypervisor host fail, another must be able to now run the VMs from the failed host. In Windows Server, a Windows Server Failover Cluster (WSFC), enabled by the Failover Clustering feature, provides this functionality. For more information on WSFCs, see the Microsoft Learn article “Failover Clustering in Windows Server and Azure Stack HCI”.
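A minimal sketch of enabling a WSFC with PowerShell follows (the node names, cluster name, and IP address are placeholders; run the full validation process before creating a production cluster):
# Install the feature on each Hyper-V host
Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools
# Validate the nodes, then create the cluster
Test-Cluster -Node "HVHost1", "HVHost2"
New-Cluster -Name "HVCluster" -Node "HVHost1", "HVHost2" -StaticAddress "192.168.1.50"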
Since this article is focused on Hyper-V hosts, it will not discuss VM-level availability. ActiveCluster and ActiveDR can be important features of FlashArray to explore to protect both Hyper-V hosts as well as VMs. See the paper Microsoft Hyper-V Stretched Cluster with ActiveCluster for more information on Hyper-V and ActiveCluster.
Networking
Networking is the heart of the modern data center. A VM running on Hyper-V relies on the hypervisor host and all the layers below it to provide the networking required by the VM which includes network performance, reliability, and availability. This section will discuss how to configure networking in Hyper-V and at the hardware layer.
Types of Networks for Hyper-V
Multiple logical and/or physical networks are recommended for segmenting and isolating certain workloads. Some common network traffic that are usually segmented include:
● Host management
● Live Migration communication
● Storage connectivity and communication if using iSCSI or another network-based storage presentation
● WSFC-level communication
● VM communication
For example, iSCSI traffic should be isolated to reduce broadcast traffic, improve security, and allow for QoS prioritization to improve and add consistency to storage traffic performance.
Virtual Local Area Network
A Virtual Local Area Network (VLAN) is a logically defined network that partitions an existing network. The use of VLANs is a straightforward means to segment and isolate these various types of traffic without the added complexity and expense of multiple physical networks and additional network adapters.
VLAN tagging is how a logical VLAN is mapped and presented for use to the Hyper-V host and a VM’s NIC traffic. The VLAN is assigned as a number corresponding to the logical network number on the physical switch port. If a VLAN tag is assigned as a default on a switch port, a VLAN tag in Hyper-V is not necessary, as the network traffic will automatically be assigned the VLAN tag after the communication leaves the Hyper-V host.
To set a VLAN on a NIC, use the PowerShell cmdlet Set-NetAdapter. An example setting the VLAN to 52 is as follows:
Set-NetAdapter -Name "Ethernet 1" -VlanID 52
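A VLAN can also be assigned to a VM’s vNIC from the host; a sketch using an example VM name is below.
Set-VMNetworkAdapterVlan -VMName "VM01" -Access -VlanId 52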
If a network traffic source is exceptionally demanding, such as high I/O workloads, consider extending this separation to a new physical network – not just a logical one on the same network. Use additional physical NICs and physical switch ports or switches dedicated to a traffic source, such as iSCSI traffic for CSVs, to fully maximize the available throughput and minimize congestion with other traffic types and sources.
Underlying Network Availability
Seemingly small network events can have ripple effects and trigger an outage downstream, especially for WSFCs. Network segmentation can help isolate and prioritize certain types of traffic. Performance-oriented settings can also boost the speed of this interconnect architecture.
Network architecture is rather straightforward at first. Connect a physical host to more than a single switch to ensure redundancy in and out of the host. There should be multiple physical NICs – not just multiple ports on a single NIC – to connect the server to these switches.
Redundancy and Isolation
To start, understand the physical hardware topology inside and outside of the host server. Network adapter or interconnect cables can fail, as can a networking switch port or even the switch itself. The failure can be much more localized. If a dual-port network adapter is a mezzanine card attached to the mainboard of the server, and the mezzanine card suffers a failure, both of the network adapter ports can experience an outage at the same time. As a result, plan for failure with adapter replacement. Multiple network adapters with different host backplane connections can help minimize the impact of an adapter or port failure.
Workloads can also saturate one or more interconnects. Even mundane tasks such as server or application-level backups can saturate a network and cause cascading impacts to other services that depend on that interconnect path. Depending on your server architecture, multiple networks can help minimize the impact of these operations. Separating purposes such as server and host management, CSV connectivity, Live Migration communication, and backups could be better suited with multiple network adapters specific to a single purpose.
In some cases, having multiple sets of network adapters for different purposes can be overbearing and a cable or switch port management nightmare. Limiting network visibility through either port isolation or VLAN tagging can help multiple distinct purposes coexist on the same physical network while providing logical isolation for each of these tasks. Quality of Service (QoS) can help prioritize certain types of traffic over others to minimize the impact to these workloads. VLAN tagging to segment the network traffic can be managed within the network configuration of the Hyper-V host, as long as the physical network is configured appropriately.
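As one hedged example of Hyper-V QoS, a vSwitch can be created in weight-based minimum bandwidth mode and a relative weight assigned to a VM’s traffic (the switch, adapter, and VM names and the weight value are illustrative only):
# Create a vSwitch that uses relative bandwidth weights
New-VMSwitch -Name "ConvergedSwitch" -NetAdapterName "Ethernet 2" -MinimumBandwidthMode Weight
# Give this VM's traffic a relative weight of 50
Set-VMNetworkAdapter -VMName "VM01" -MinimumBandwidthWeight 50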
Flow Control
Consider enabling the network flow control setting on the physical switch to optimize performance. Flow control is a mechanism for briefly stopping the flow of data to avoid packet loss if network congestion is present. Consult your switch vendor's documentation for information on how they implement flow control.
Virtual Switches
Hyper-V requires at least one virtual switch (vSwitch). The vSwitch is a software-based layer-2 Ethernet switch. Depending on your requirements, you may have more than one vSwitch. Depending on how the vSwitch is configured, it can communicate with the physical network (external), with only the host and the VMs on it (internal), or with the VMs on that host only (private). Each virtual network adapter (vNIC) configured on a VM is assigned to a vSwitch. An example is shown in Figure 2.
Figure 2. Virtual Switch Manager
For more information, see the Microsoft Learn article “Plan for Hyper-V networking in Windows Server”.
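As a minimal sketch, an external vSwitch bound to a physical adapter can also be created with PowerShell (the switch and adapter names are examples):
New-VMSwitch -Name "ExternalSwitch" -NetAdapterName "Ethernet 1" -AllowManagementOS $true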
Physical Network Card Configuration
This section will discuss the configuration of network cards in the Hyper-V host.
Network Card Power Management
Disable any network adapter power management to ensure consistent performance, especially if using network-based storage such as iSCSI.
Settings are often specific to a vendor's network adapter. Use the documentation on Microsoft Learn for the PowerShell cmdlet Disable-NetAdapterPowerManagement to see which options are available for disabling any power management settings present. Note that the use of this command could temporarily restart the network adapter and interrupt server connectivity. Below is an example execution that disables all power management features on a NIC named Ethernet 1.
Disable-NetAdapterPowerManagement -Name "Ethernet 1"
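The resulting state can be confirmed with the companion cmdlet (the adapter name is an example):
Get-NetAdapterPowerManagement -Name "Ethernet 1"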
Single Root I/O Virtualization
Single Root I/O Virtualization (SR-IOV) is an extension of the PCI Express (PCIe) specification that allows a device, including network adapters, to separate access to its resources among various hardware functions and differentiate between different traffic streams. It allows traffic streams to be delivered directly to the Hyper-V or VM layers separately, bypassing the software switch layer of the virtualization stack and reducing latency and host CPU overhead.
SR-IOV must be enabled in UEFI/BIOS. To check that this has been set, issue the following PowerShell command on each Hyper-V host:
(Get-VMHost).IovSupport
This should return a value of True if SR-IOV is available and False if it is not available or not enabled.
Enabling SR-IOV on a NIC can be done via PowerShell. An example is shown below.
Enable-NetAdapterSriov -Name "Ethernet 1"
Figure 3 also shows it enabled in the driver for a NIC.
Figure 3. SR-IOV enabled on a NIC
To enable SR-IOV on the Hyper-V vSwitch, open the Virtual Switch Manager. Select or create the virtual switch, and next to the External network connection type, select “Enable single-root I/O virtualization (SR-IOV)”, as shown in Figure 4.
Figure 4. Single-Root I/O Virtualization in Hyper-V Manager
Alternatively, use PowerShell to enable SR-IOV when creating a new virtual switch. An example is shown below.
New-VMSwitch <virtual-switch-name> -NetAdapterName <network-adapter-name> -EnableIov $true
Finally, Figure 5 shows enabling SR-IOV on a VM's vNIC.
Figure 5. Single-Root I/O Virtualization for a vNIC in Hyper-V Manager
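The equivalent can be done with PowerShell by assigning an IOV weight to the VM's network adapter (a sketch; the VM name is an example, and a weight of 0 disables SR-IOV):
Set-VMNetworkAdapter -VMName "VM01" -IovWeight 100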
Remote Direct Memory Access
The use of Remote Direct Memory Access, or RDMA, is highly encouraged for high-speed interconnects between Hyper-V hosts for functionality such as Live Migration. RDMA requires both RDMA-enabled NICs and switches that support RDMA to exchange data in main memory without requiring the CPU or operating system to act as an intermediary. RDMA-enabled server adapters boost performance because they reduce resource consumption (mainly CPU), increase network throughput, and lower latency.
RDMA is currently not supported between a Hyper-V host and a FlashArray. Therefore, RDMA cannot be used for storage communication.
Receive Side Scaling
Receive Side Scaling (RSS) distributes the processing of inbound network traffic across multiple processor cores rather than a single core. To enable RSS for all physical adapters in the Hyper-V host, the following script can be used. First, RSS is enabled for use globally, then enabled per physical network adapter.
Command Line example enabling RSS globally.
netsh interface tcp set global rss=enabled
Using PowerShell to enable RSS on each adapter.
foreach($NIC in (Get-NetAdapter -Physical)) {
Enable-NetAdapterRSS -name $NIC.name
}
Figure 6 shows RSS enabled on a NIC.
Figure 6. RSS enabled
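To confirm the setting, the current RSS state of each adapter can be checked with PowerShell:
Get-NetAdapterRss | Format-Table Name, Enabled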
Virtual Machine Queue
Virtual Machine Queue (VMQ) can be enabled on the physical NICs on the Hyper-V host. VMQ is a network adapter offload technology that can extend Native RSS to the Hyper-V layer. The goal is to allow for Hyper-V to manage the network traffic processing queues in the host hypervisor layer and both reduce host CPU consumption and improve guest network throughput by spreading out the CPU load across multiple host processors and by using direct memory access to transfer network packets directly into a VM’s shared memory. This feature must be supported by the physical network adapter, and once enabled at the host as shown in Figure 7, can be enabled per vNIC.
Figure 7. VMQ enabled on a NIC
Note that RSS cannot be enabled on network adapters that are bound to a Hyper-V virtual switch; VMQ provides the equivalent queue distribution for those adapters.
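A sketch of enabling VMQ on a physical adapter and weighting a VM's vNIC for VMQ follows (the adapter and VM names are examples):
Enable-NetAdapterVmq -Name "Ethernet 1"
Set-VMNetworkAdapter -VMName "VM01" -VmqWeight 100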
NIC Teaming
Microsoft has natively supported NIC teaming, or the ability to combine multiple NICs into a single logical NIC, since Windows Server 2012. Prior to that, teaming was achieved with proprietary software and did not always work well with features like a WSFC. If employed, teaming can enhance both performance and availability, depending on the configuration. To understand teaming in Windows Server, reference “Windows Supported Networking Scenarios” at Microsoft Learn.
There are two types of teaming:
● Switch independent – The team is configured without any participation from the physical network switch.
● Switch dependent – The team requires the physical network switch to participate in the teaming configuration, for example with static teaming or LACP.
For Windows Server 2016 and above, Switch Embedded Teaming (SET) is the recommended teaming method for Hyper-V hosts. SET was introduced with Windows Server and System Center Virtual Machine Manager (SCVMM) 2016 as a new method to simplify the teaming of multiple network adapters. SET is managed at the Hyper-V switch level and not at a network team level. Physical NIC teaming is no longer required to allow more concurrent traffic across multiple network adapters. SET is specifically integrated with Packet Direct, Converged RDMA vNICs, and software-defined networking (SDN) quality of service (QoS) configurations. SET can be enabled within SCVMM by configuring a logical switch with an embedded team uplink mode.
Examples of creating and verifying a SET-enabled vSwitch using PowerShell are below. Change the names of the network adapters to match your hardware.
New Switch
New-VMSwitch -Name "VMSETSwitch" -NetAdapterName "Ethernet 3", "Ethernet 4" -EnableEmbeddedTeaming $true
Verify an existing switch
Get-VMSwitch | FL Name, EmbeddedTeamingEnabled, NetAdapterInterfaceDescriptions, SwitchType
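If a physical adapter later needs to be added to an existing SET team, it can be done with the Hyper-V module (a sketch; the adapter name is an example):
Add-VMSwitchTeamMember -VMSwitchName "VMSETSwitch" -NetAdapterName "Ethernet 5"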
For more information on SDN, refer to the article “Software Defined Networking (SDN) in Azure Stack HCI and Windows Server” in Microsoft Learn.
For more on NIC teaming and iSCSI, see the upcoming section “iSCSI Configuration Best Practices”.
Storage
The storage behind Hyper-V is critical. This section will discuss the considerations for ensuring that storage for the hosts and VMs is configured optimally.
Pure has a PowerShell cmdlet as part of the Pure Storage PowerShell Toolkit called Test-WindowsBestPractices that should be used to ensure the Hyper-V host is configured properly. See the article “Validate Windows Server with Test-WindowsBestPractices Cmdlet” in the Microsoft Platform Guide for more information.
Host Storage Connectivity
A FlashArray can be connected to a Hyper-V host via Fibre Channel (FC) or iSCSI. This section will cover both.
As noted above, RDMA cannot be used to connect to a FlashArray from a Hyper-V host. Therefore, RDMA cannot be used for storage communication.
Multiple Paths to FlashArray
Both FC and iSCSI support the Multipath IO (MPIO) feature of Windows Server. This feature allows a Hyper-V host to communicate to and from storage arrays across multiple paths to provide improved performance, load balancing, and resiliency for storage connections. Each physical HBA should have ports connected via redundant switching to different controller interconnects in the storage array. FlashArray has multiple interconnects on each redundant controller. An example of multiple hosts connected with active multipathing to a FlashArray is shown in Figure 8.
Figure 8. Active multipathing
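As a rough sketch, the MPIO feature can be installed and FlashArray devices claimed by the Microsoft DSM using PowerShell (the vendor and product IDs shown follow Pure's published guidance; a reboot may be required, and the Test-WindowsBestPractices cmdlet mentioned earlier performs equivalent configuration):
# Install the Multipath-IO feature
Install-WindowsFeature -Name Multipath-IO
# Claim FlashArray devices for the Microsoft DSM
New-MSDSMSupportedHw -VendorId "PURE" -ProductId "FlashArray"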
A physical host should be connected to more than a single switch to ensure redundancy in and out of the host. These redundant switches are used to connect each physical server and devices like shared storage in a redundant manner. Physical interfaces, either network adapters or Host Bus Adapters (HBAs), or both, connect the physical server to the switches, and should be physically redundant within the host. One or more logical virtual switches at the Hyper-V layer act as an intermediary to pass through and route VM traffic to the physical adapter(s).
The VM’s vNIC or virtual Fibre Channel Adapter (vHBA) is then bound to a virtual switch port to complete the end-to-end connection to the physical network. A number of redundancy and performance-oriented factors need to be reviewed to ensure the network architecture is optimal for Hyper-V and storage.
Fibre Channel Best Practices
A Host Bus Adapter (HBA) is a dedicated adapter, typically using fiber-optic transceivers, used to connect to Fibre Channel (FC)-based storage networks. A FlashArray can be connected to a Hyper-V host via FC.
Physical HBAs have a tunable setting on the physical device called queue depth. The queue depth value specifies the number of I/O requests that the host will put into a queue for a target storage port. If the maximum queue depth is reached, the device will reject and discard additional commands, forcing the Hyper-V host to reissue the I/O request a short time later. This results in latency to the host and/or the VM. If this discard and retry occurs, the observed storage latency inside the Hyper-V VM will increase, potentially causing performance degradation for the workloads.
When switching to faster storage such as FlashArray, default queue depth values may need to be changed. If necessary, adjusting the queue depth to a higher value can help boost performance levels from the Hyper-V host and VM workloads. However, increasing it too high can result in performance degradation from overloading the SAN controllers, as the ports might be unable to handle such a high rate of concurrent commands. All Hyper-V hosts connected to these ports could be negatively impacted.
Check with your HBA hardware vendor for the value of your HBA queue depths and adjustment process, and test accordingly with a combination of your application workload and synthetic benchmarking tools to find the right queue depth balance for your storage array.
A good reference to see how to tune FlashArray with FC-based implementations is the article “SAN Guidelines for Maximizing Pure Performance”.
iSCSI Best Practices
iSCSI uses NICs to connect to FlashArray so that block storage can be presented to the Hyper-V host.
To properly implement iSCSI and MPIO on Hyper-V hosts with FlashArray, read the documents “iSCSI Best Practices for Windows Server and FlashArray” and “Setup iSCSI on Windows Server”, and the MPIO articles in the Microsoft Platform Guide. These include information on configuring settings such as Delayed Acknowledgement (Delayed Ack for short), which can potentially reduce end-to-end latency, as well as TcpAckFrequency and TcpNoDelay, which are covered in the iSCSI best practices document above.
For iSCSI-based Pure Storage connections, bi-directional flow control should be enabled for all switch ports that are used for iSCSI traffic.
At least one dedicated isolated network or VLAN should be configured for iSCSI traffic.
If more than one network is used for iSCSI and the Hyper-V hosts are clustered as a WSFC, each iSCSI network must have its own subnet. A WSFC treats each subnet as a distinct cluster network, which means a separate subnet is required for each iSCSI network; a single subnet cannot be shared across them. This also affects how the corresponding interfaces are configured on the FlashArray.
Do not use NIC teaming for iSCSI networks; use the native MPIO stack of Windows Server. The Multipath-IO (MPIO) feature of Windows Server is more efficient and therefore faster for this type of traffic.
Jumbo frames are a setting that may improve performance for some configurations that use iSCSI. Before configuring jumbo frames, evaluate whether the effort and cost of managing jumbo frames outweigh the potential performance gain. If the performance gain will be large, the endeavor may be worth it. Tuning jumbo frames involves changing the Maximum Transmission Unit (MTU), which usually defaults to 1500 and can be set as high as 9000. While this allows iSCSI network communication to fully leverage the available bandwidth of the physical network, there are caveats and challenges. All switch port hops and end-to-end connections between the Hyper-V hosts, physical and virtual switches, and enterprise storage must be fully configured for jumbo frames to achieve the full networking rate of speed. If anything in that chain is misconfigured, payload mismatches will result in severe degradation of performance or, at worst, no connectivity to storage.
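If jumbo frames are adopted end to end, the MTU on a Windows Server NIC is typically changed through the adapter's advanced properties; a sketch is below (the adapter name is an example, and the exact registry value, often 9014, varies by vendor):
Set-NetAdapterAdvancedProperty -Name "Ethernet 1" -RegistryKeyword "*JumboPacket" -RegistryValue 9014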
As part of the MPIO configuration, use either the Round Robin (RR) or Least Queue Depth (LQD) load balancing policies to improve the performance and utilization of all available paths to the underlying storage. For hosts with 10 or fewer paths to a FlashArray volume, LQD is recommended for use. Test each algorithm to determine which works more effectively for your workload. Follow the best practices documented in the Microsoft Platform Guide for MPIO and iSCSI.
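For example, the Windows Server default MPIO load balancing policy can be set to LQD with the MPIO PowerShell module (a sketch; apply and test during a maintenance window):
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy LQD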
These settings are configured by Pure’s best practices cmdlet Test-WindowsBestPractices mentioned earlier.
Presenting Storage to Hosts
Presenting storage to Hyper-V hosts, which in turn make storage available to VMs, is no different than presenting storage to Windows Server that does not use Hyper-V. The considerations for Hyper-V revolve around how presented storage is configured in Windows Server.
Hyper-V has two main methods to present storage for use with VMs: Cluster Shared Volumes (CSVs) and Storage Spaces Direct (S2D). Both require a WSFC. S2D is an OS-based software-defined storage platform. It uses OS-level storage replication to make copies of data on locally-attached storage to other hosts in a host cluster. Storage only presented to a single host (“locally attached storage”) should not be used for Hyper-V implementations when FlashArray is involved.
For FlashArray, implementations of Hyper-V should consider CSV as well as a Scale Out File Server (SOFS) to store the virtual disks used by VMs. Pure published a document on SOFS configuration here.
Both features rely on a WSFC. A CSV enables multiple nodes of a Hyper-V cluster to have concurrent read and write access to the underlying Volume(s). With CSVs, the storage can be owned by any host participating in the WSFC and can be failed over to another host without disrupting a VM’s connectivity to its storage or overall availability. Figure 9 shows a configured CSV.
Figure 9. Cluster Shared Volume in Failover Cluster Manager
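A clustered disk can be converted to a CSV with the Failover Clustering PowerShell module; a minimal sketch (the disk name is an example):
Add-ClusterSharedVolume -Name "Cluster Disk 1"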
How many CSVs are required? The answer is “it depends”. Configure as many as necessary based on your workload criteria and requirements. Create volumes on FlashArray and place your workloads carefully. Common considerations include determining which VMs need to be snapshotted and/or replicated at the same time, so that those VMs can be placed on the same volume(s).
Consider enabling the CSV Cache to improve read performance for commonly accessed data. Further details on the CSV Cache are available at the article “How to Enable CSV Cache” in Microsoft Learn. Microsoft’s documentation for CSVs can be found in the Learn article “Use Cluster Shared Volumes in a failover cluster” and with their free training “Introduction to Cluster Shared Volumes”.
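As a sketch of how the CSV cache size can be set with PowerShell (the size in MB is an example; see the Microsoft article above for guidance on sizing):
(Get-Cluster).BlockCacheSize = 2048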
Pure’s documentation on how to properly configure a CSV with FlashArray can be found in the article “Working with Cluster Shared Volumes on a Windows Server Host”.
To back up a CSV, Pure currently utilizes the Pure Storage Hardware Provider for Microsoft VSS. This utility allows VSS to produce application-consistent volume backups. More details on this utility are found in the article “Volume Shadow Service (VSS) Hardware Provider”.
While Microsoft supports SMB for Hyper-V deployments and FlashArray supports SMB, consider block-based, not network-based, storage for Hyper-V deployments. FlashArray does not currently support persistent file handles or Continuously Available SMB shares. This means that in the event of a storage controller failure, VMs running on a Hyper-V host connected to FlashArray over SMB could be impacted. If you are looking to use SMB with Hyper-V on FlashArray, talk to Pure.
Summary
The considerations in this article should ensure that Hyper-V hosts are optimally configured when using FlashArray.