With the initial release of vSphere 7.0, Pure Storage's HTML-5 vCenter plugin is not yet fully integrated with NVMe-oF functionality. One integration not yet available is automatic setup of the vSphere environment (as the plugin does for iSCSI). This KB serves as a guide for setting up your environment manually.
NOTE: This guide is specific to Pure Storage and vSphere setup; this guide will NOT include setting up the switched fabric. Please see Configuring an Arista Switch for use with the Pure Storage FlashArray NVMe/RoCE for information on these steps.
Configuring the vSphere Environment
Configuration of RDMA NICs on ESXi
Mellanox RDMA NICs
Enabling Priority Flow Control (PFC) and DSCP
NVMe/RoCE requires a lossless fabric. The adapter must therefore be configured for Priority Flow Control (PFC) and must trust the QoS (DSCP) bits in the IP header so that traffic is placed in the appropriate queues on the adapter, both inbound and outbound. Neither setting is enabled by default on Mellanox adapters; both must be configured manually.
1. Run the following commands via SSH on all the ESXi hosts that will be connected via NVMe-RDMA:
[root@barney:~] esxcli system module parameters set -m nmlx5_core -p "pfctx=0x08 pfcrx=0x08 trust_state=2 max_vfs=4"
[root@barney:~] esxcli system module parameters set -m nmlx5_rdma -p "dscp_force=26"
2. Reboot the host.
If you have a Mellanox adapter using the nmlx4_core driver, complete the steps under "Enabling RoCEv2 on Mellanox adapters using nmlx4_core drivers" below before rebooting.
3. Verify the changes are in place by running the following commands:
[root@barney:~] esxcli system module parameters list -m nmlx5_core |grep pfc.*int
pfcrx  int  0x08  Priority based Flow Control policy on RX.
pfctx  int  0x08  Priority based Flow Control policy on TX.
4. Verify the RDMA NIC is sending / receiving in the Priority 3 class (replace "vmnic5" with your applicable NIC):
[root@barney:~] vsish -e cat /net/pNics/vmnic5/stats |grep xPrio3
rxPrio3_bytes: 2876110056
rxPrio3_packets: 715770
txPrio3_bytes: 47429772
txPrio3_packets: 64836
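The values used in the commands above can be sanity-checked with a little arithmetic. The following is a minimal Python sketch, not part of the Pure Storage or Mellanox tooling: it assumes that pfctx/pfcrx are 8-bit masks with one bit per priority (0-7), and that the adapter maps a DSCP value to a priority using its top three bits. Both are assumptions about the driver's default behavior; the function names are made up for illustration.

```python
def pfc_priorities(mask):
    """Return the priorities (0-7) whose bit is set in an 8-bit PFC mask."""
    return [prio for prio in range(8) if mask & (1 << prio)]


def dscp_to_priority(dscp):
    """Map a 6-bit DSCP value to a priority via its top three bits.

    Assumption: this mirrors the adapter's default DSCP-to-priority mapping.
    """
    return dscp >> 3


print(pfc_priorities(0x08))   # priorities covered by pfctx/pfcrx=0x08
print(dscp_to_priority(26))   # priority that dscp_force=26 maps to
```

Under those assumptions both values point at priority 3, which is the class checked with the xPrio3 counters in step 4.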
Enabling RoCEv2 on Mellanox adapters using nmlx4_core drivers
If the ESXi hosts are utilizing Mellanox adapters with the nmlx4_core driver, you must enable RoCEv2, because the default operating mode for this particular driver is RoCEv1.
To determine which driver your Mellanox adapters are using, look at the Driver column in the output of the following command on the ESXi host(s):
[root@barney:~] esxcfg-nics -l |grep -E 'Name|Mellanox'
Name    PCI           Driver      Link  Speed      Duplex  MAC Address        MTU   Description
vmnic4  0000:42:00.0  nmlx5_core  Up    25000Mbps  Full    ec:0d:9a:82:5a:32  9000  Mellanox Technologies ConnectX-4 Lx EN NIC; 25GbE; dual-port SFP28; (MCX4121A-ACA)
vmnic5  0000:42:00.1  nmlx5_core  Up    25000Mbps  Full    ec:0d:9a:82:5a:33  9000  Mellanox Technologies ConnectX-4 Lx EN NIC; 25GbE; dual-port SFP28; (MCX4121A-ACA)
We can confirm in the above example that the Mellanox adapter is utilizing the nmlx5_core driver so changes wouldn't be required here. If yours reports nmlx4_core then follow the steps below.
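When many hosts need to be checked, the driver column can be extracted from saved command output instead of being read by eye. The following is a minimal Python sketch that assumes the whitespace-separated layout shown in the example above (name, PCI address, driver as the first three fields); nic_drivers and needs_rocev2_enable are hypothetical helper names.

```python
SAMPLE_OUTPUT = """\
Name PCI Driver Link Speed Duplex MAC Address MTU Description
vmnic4 0000:42:00.0 nmlx5_core Up 25000Mbps Full ec:0d:9a:82:5a:32 9000 Mellanox Technologies ConnectX-4 Lx EN NIC
vmnic5 0000:42:00.1 nmlx5_core Up 25000Mbps Full ec:0d:9a:82:5a:33 9000 Mellanox Technologies ConnectX-4 Lx EN NIC
"""


def nic_drivers(esxcfg_nics_output):
    """Return {vmnic name: driver} from `esxcfg-nics -l` text output."""
    drivers = {}
    for line in esxcfg_nics_output.splitlines():
        fields = line.split()
        # Data rows start with the vmnic name; the driver is the third field.
        if len(fields) >= 3 and fields[0].startswith("vmnic"):
            drivers[fields[0]] = fields[2]
    return drivers


def needs_rocev2_enable(drivers):
    """Return the vmnics on nmlx4_core, which defaults to RoCEv1."""
    return sorted(nic for nic, drv in drivers.items() if drv == "nmlx4_core")


drivers = nic_drivers(SAMPLE_OUTPUT)
print(drivers)
print(needs_rocev2_enable(drivers))
```

An empty list from needs_rocev2_enable corresponds to the case described above, where no RoCEv2 change is required.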
The steps below are only applicable to the nmlx4_core driver. No action is required if the Mellanox adapter is using the nmlx5_core driver.
1. Run the following command via SSH on all the ESXi hosts that will be connected via NVMe-RDMA:
esxcli system module parameters set -p enable_rocev2=1 -m nmlx4_core
2. Reboot the host.
Once the host has been rebooted you are able to confirm the change has taken effect by running the following command:
[root@barney:~] esxcli system module parameters list --module=nmlx4_core |grep -E 'Name|enable_rocev2'
Name           Type  Value  Description
enable_rocev2  int   1      Enable RoCEv2 mode for all devices
Broadcom RDMA NICs
The following steps must be taken for all Broadcom RDMA adapters.
1. Ensure that NIC + RDMA mode is enabled for the RDMA NICs on the Device Configuration page in the BIOS of the host.
Refer to the vendor documentation on how to enter BIOS and modify this value.
2. Once you have confirmed NIC + RDMA mode is enabled, run the following commands on the ESXi hosts to enable RoCEv2:
esxcli system module parameters set -m bnxtnet -p disable_roce=0
esxcli system module parameters set -m bnxtroce -p disable_rocev2=0
3. Reboot the host.
Again, to verify the settings have taken effect, run the following commands:
[root@barney:~] esxcli system module parameters list --module=bnxtnet |grep -E 'Name|disable_roce'
Name          Type  Value  Description
disable_roce  bool  0      Disable the RoCE support. 0: enable RoCE support, 1: disable RoCE support. [Default: 1]
[root@barney:~] esxcli system module parameters list --module=bnxtroce |grep -E 'Name|disable_rocev2'
Name            Type  Value  Description
disable_rocev2  bool  0      set to 1 to disable ROCEv2 support, set to 0 to enable RoCEv2 support (default)
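The verification steps in this section all follow the same pattern: list the module parameters and compare the Value column with what was set. The following is a minimal Python sketch of that comparison on saved output. It assumes the Name / Type / Value / Description layout shown above and that the Value column is populated (a parameter that was never set may show an empty value, which this simple whitespace split does not handle). The helper names are hypothetical.

```python
SAMPLE_OUTPUT = """\
Name Type Value Description
disable_roce bool 0 Disable the RoCE support. 0: enable RoCE support, 1: disable RoCE support. [Default: 1]
"""


def parse_module_parameters(output):
    """Return {parameter name: value string} from the parameter listing."""
    params = {}
    for line in output.splitlines():
        fields = line.split(None, 3)  # name, type, value, rest of description
        if len(fields) < 3:
            continue
        if fields[0] == "Name" or set(fields[0]) == {"-"}:
            continue  # skip the header row and any dashed separator row
        params[fields[0]] = fields[2]
    return params


def mismatches(actual, expected):
    """Return {name: (expected, actual)} for every value that differs."""
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }


actual = parse_module_parameters(SAMPLE_OUTPUT)
print(mismatches(actual, {"disable_roce": "0"}))
```

An empty result means every expected parameter has the expected value; the same helper can be pointed at the nmlx5_core or nmlx4_core listings with the values set earlier in this guide.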
Creation of vSwitches, port groups, and vmkernel ports
Once you have enabled the appropriate configuration on your physical RDMA NICs (if applicable), the next step is to configure the vSwitches, port groups, and vmkernel ports on the ESXi hosts. If you have set up iSCSI previously you will notice that it is a very similar setup and configuration process.
The configuration below is only one option (using standard vSwitches). A vSphere Distributed Switch (vDS) is also an acceptable configuration option.
The important points to consider when setting up your environment:
- At least 2 different port groups are required. (You may have up to 4)
- Each port group should have only 1 physical RDMA NIC port in the "Active adapters" section. The other adapter port(s) should be in the "Unused adapters" section.
- Ensure MTU is configured consistently between the vmkernel adapters and the virtual switch.
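The three points above can be expressed as a simple checklist over a description of the planned layout. The following is a minimal Python sketch: the data structure (a list of port group dictionaries plus a vSwitch-to-MTU mapping) and all names in it are hypothetical and are not read from vCenter or ESXi.

```python
EXAMPLE_VSWITCH_MTU = {"vSwitch1": 9000, "vSwitch2": 9000}
EXAMPLE_PORT_GROUPS = [
    {"name": "NVMe-RDMA-A", "vswitch": "vSwitch1", "active": ["vmnic4"], "vmk_mtu": 9000},
    {"name": "NVMe-RDMA-B", "vswitch": "vSwitch2", "active": ["vmnic5"], "vmk_mtu": 9000},
]


def check_port_groups(port_groups, vswitch_mtu):
    """Return a list of human-readable problems; an empty list means the layout passes."""
    problems = []
    # At least 2 port groups are required, and up to 4 are allowed.
    if not 2 <= len(port_groups) <= 4:
        problems.append(f"expected 2 to 4 port groups, found {len(port_groups)}")
    for pg in port_groups:
        # Exactly one physical RDMA NIC port may be active per port group.
        if len(pg["active"]) != 1:
            problems.append(f"{pg['name']}: expected 1 active adapter, found {len(pg['active'])}")
        # The vmkernel adapter MTU must match the MTU of its virtual switch.
        switch_mtu = vswitch_mtu[pg["vswitch"]]
        if pg["vmk_mtu"] != switch_mtu:
            problems.append(f"{pg['name']}: vmkernel MTU {pg['vmk_mtu']} != vSwitch MTU {switch_mtu}")
    return problems


print(check_port_groups(EXAMPLE_PORT_GROUPS, EXAMPLE_VSWITCH_MTU))
```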
Example setup configuration for standard vSwitches:
1. Select the desired ESXi host, select the Configure tab, locate the Networking section, and select Virtual switches.
Once you are on the Virtual switches page, select Add Networking.
2. Select VMkernel Network Adapter.
3. Select New standard switch and input the desired MTU size.
4. Add one of the desired Physical NIC ports to the Active Adapters leaving the Standby and Unused adapters empty.
5. Input the desired name of your Port Group Network label and any other applicable settings. Ensure you leave all of the Enabled services unchecked.
6. Input the desired IP address and subnet mask for the vmknic.
7. Review all of the listed fields and ensure the configuration is correct, then Finish the setup.
8. Repeat steps 1-7 for the other vSwitch, port group, and vmkernel port.
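When the same layout has to be repeated on several hosts, the objects created in the steps above can in principle also be created from the ESXi shell. The following is a minimal Python sketch that only builds the esxcli command strings for one vSwitch, port group, and vmkernel port so they can be reviewed before anything is run. All names and addresses (vSwitch-NVMe-A, vmk2, 192.168.10.11, and so on) are hypothetical placeholders, and the commands should be checked against the ESXCLI reference for your ESXi build before use.

```python
def vswitch_commands(vswitch, uplink, portgroup, vmk, ip, netmask, mtu=9000):
    """Build esxcli commands for one standard vSwitch with a single uplink."""
    return [
        # Create the vSwitch and set its MTU.
        f"esxcli network vswitch standard add --vswitch-name={vswitch}",
        f"esxcli network vswitch standard set --vswitch-name={vswitch} --mtu={mtu}",
        # Attach exactly one physical RDMA NIC port as the uplink.
        f"esxcli network vswitch standard uplink add --vswitch-name={vswitch} --uplink-name={uplink}",
        # Create the port group and the vmkernel port with a matching MTU.
        f"esxcli network vswitch standard portgroup add --vswitch-name={vswitch} --portgroup-name={portgroup}",
        f"esxcli network ip interface add --interface-name={vmk} --portgroup-name={portgroup} --mtu={mtu}",
        f"esxcli network ip interface ipv4 set --interface-name={vmk} --ipv4={ip} --netmask={netmask} --type=static",
    ]


# Hypothetical values for the first and second paths.
for cmd in vswitch_commands("vSwitch-NVMe-A", "vmnic4", "NVMe-RDMA-A", "vmk2", "192.168.10.11", "255.255.255.0"):
    print(cmd)
for cmd in vswitch_commands("vSwitch-NVMe-B", "vmnic5", "NVMe-RDMA-B", "vmk3", "192.168.20.11", "255.255.255.0"):
    print(cmd)
```

Using one vSwitch per uplink keeps each port group at a single active adapter without any teaming override, which matches the example configuration in this section.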
A common question when configuring RDMA capable NICs (RNICs) for use with ESXi is, "Can I configure the RNIC with multiple vSphere Services (NVMe-RDMA, vMotion, Replication, etc) to take advantage of the higher performing network adapter?".
While it is possible to configure each RNIC with multiple services, it is not recommended by VMware or Pure Storage. This is primarily due to the complexity required to ensure every service works as expected when this configuration is in use.
While it is not currently recommended this may change in the future as Pure Storage and VMware work together to investigate how this can be better supported.
Creation and Configuration of VMware Software NVMe-RDMA Storage Adapters
After each ESXi host is properly configured and set up with network connectivity, the next step is to create the Software NVMe-RDMA Storage Adapters.
Note that you will add at least two "Software NVMe over RDMA adapters" to each ESXi host. This is outlined in more detail in Step 2 below.
1. Select the desired ESXi host, select the Configure tab, locate the "Storage" section, and select Storage Adapters.
Once you are on the Storage Adapters page, select Add Software Adapter.
2. Select Add software NVMe over RDMA adapter and choose which RDMA device you want to add.
You will repeat this process for all RDMA ports you plan on using.
If you have more than 2 RDMA adapters available and do not plan on using all of them, you can look at the following ESXi host CLI output to compare which physical ports are associated with each virtual RDMA adapter:
[root@barney:~] esxcli rdma device list
Name     Driver      State   MTU   Speed    Paired Uplink  Description
-------  ----------  ------  ----  -------  -------------  -----------
vmrdma0  nmlx5_rdma  Active  4096  25 Gbps  vmnic4         MT27630 Family [ConnectX-4 LX]
vmrdma1  nmlx5_rdma  Active  4096  25 Gbps  vmnic5         MT27630 Family [ConnectX-4 LX]
3. After you have added all applicable Software NVMe-RDMA Storage Adapters the next step is to configure the Controllers for every adapter.
You will select the applicable adapter you wish to configure, click Controllers and then select Add Controller.
4. You can choose to Automatically discover controllers or Enter controller details manually. For simplicity, automatic discovery is recommended unless otherwise directed by Pure Storage.
You will repeat this process for ALL FlashArray IP addresses dedicated for NVMe-RDMA connectivity.
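The `esxcli rdma device list` output referenced in step 2 can also be turned into a lookup table of virtual RDMA adapter to physical uplink. The following is a minimal Python sketch that assumes the output layout shown in that step; rdma_uplinks is a hypothetical helper name. It searches each row for the vmnic token rather than relying on a fixed column position, because the Speed column itself contains a space ("25 Gbps").

```python
SAMPLE_OUTPUT = """\
Name     Driver      State   MTU   Speed    Paired Uplink  Description
-------  ----------  ------  ----  -------  -------------  -----------
vmrdma0  nmlx5_rdma  Active  4096  25 Gbps  vmnic4         MT27630 Family [ConnectX-4 LX]
vmrdma1  nmlx5_rdma  Active  4096  25 Gbps  vmnic5         MT27630 Family [ConnectX-4 LX]
"""


def rdma_uplinks(rdma_device_list_output):
    """Return {vmrdma device: paired vmnic} from `esxcli rdma device list` text."""
    pairs = {}
    for line in rdma_device_list_output.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("vmrdma"):
            continue  # header, separator, or blank line
        # The paired uplink is the first later field that looks like a vmnic name.
        uplink = next((f for f in fields[1:] if f.startswith("vmnic")), None)
        pairs[fields[0]] = uplink
    return pairs


print(rdma_uplinks(SAMPLE_OUTPUT))
```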
Identifying the NVMe Qualified Name (NQN) of an ESXi host
After you have configured your ESXi hosts, the next step is to record the NVMe Qualified Name (NQN) of each ESXi host you plan on connecting to the FlashArray. Run the following command on each host:
[root@barney:~] esxcli nvme info get
Host NQN: nqn.2014-08.com.vmware:nvme:barney
Until NVMe-oF is implemented into the vSphere API, this is the only known way to get this information. This KB will be updated with a PowerShell and Python example once the API becomes available.
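If the output of `esxcli nvme info get` is collected from several hosts (for example over SSH), the NQN can be pulled out of the saved text with a small helper. The following is a minimal Python sketch that assumes the "Host NQN:" line format shown above; host_nqn is a hypothetical helper name and is not part of any VMware or Pure Storage tooling.

```python
SAMPLE_OUTPUT = "   Host NQN: nqn.2014-08.com.vmware:nvme:barney\n"


def host_nqn(nvme_info_output):
    """Return the host NQN from `esxcli nvme info get` text, or None if absent."""
    for line in nvme_info_output.splitlines():
        line = line.strip()
        if line.startswith("Host NQN:"):
            # Split only once: the NQN itself contains colons.
            return line.split(":", 1)[1].strip()
    return None


print(host_nqn(SAMPLE_OUTPUT))
```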
Configuring the FlashArray
Creation of Host & Host Group Objects
From the FlashArray perspective, there is little difference from setting up iSCSI or FC. If you have questions about setting up hosts and host groups, refer to the FlashArray Configuration KB, which should guide you through this process.
As a complementary addition to that KB, below is an example of adding an NQN to a host object on the FlashArray: