Pure Technical Services

Configuring a Cisco Switch for use with Pure FlashArray NVMe/RoCE

Overview

NVMe/RoCE is one of the transports that can be used to present a remote NVMe namespace to a host as if it were a local device.  The transport leverages RoCEv2, which runs over the User Datagram Protocol (UDP) on IPv4.

NVMe/RoCE traffic can be passed between the initiator and target using standard Ethernet and IP routing capabilities. RoCE requires a lossless network. To provide this, the switch must support industry-standard congestion control mechanisms.

Network Topology - Single Hop

A single-hop network topology is one where the initiator and target are separated by a single switch hop.  This design uses two switches, with at least one port from each initiator and one port from each FlashArray controller connected to each switch, as shown in the diagram.

[Figure: Cisco single-card initiator NVMe fabric addressing, 4-port configuration]

In this design there should be two subnets, one per switch, each with its own VLAN.  Those subnets should not be trunked across any connections between the two switches.  The array and the initiator should each have ports configured in both subnets, with those ports connected to the corresponding access ports on the switches.
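
The addressing rules above can be checked before any cabling is done. The following is a minimal sketch, not a Pure Storage tool: the subnets, IP addresses, and device names are hypothetical placeholders, and only the VLAN IDs (986 and 987) are taken from the configuration later in this article.

```python
import ipaddress

# Hypothetical addressing plan. Only the VLAN IDs come from this article;
# the subnets are placeholders chosen for illustration.
FABRICS = {
    "switchA": {"vlan": 986, "subnet": ipaddress.ip_network("192.168.86.0/24")},
    "switchB": {"vlan": 987, "subnet": ipaddress.ip_network("192.168.87.0/24")},
}

# Hypothetical ports: (device, switch the port is cabled to, IP address).
PORTS = [
    ("array-ct0", "switchA", "192.168.86.10"),
    ("array-ct0", "switchB", "192.168.87.10"),
    ("array-ct1", "switchA", "192.168.86.11"),
    ("array-ct1", "switchB", "192.168.87.11"),
    ("host1", "switchA", "192.168.86.101"),
    ("host1", "switchB", "192.168.87.101"),
]


def check_plan(fabrics, ports):
    """Return a list of problems found in the plan (empty if none)."""
    problems = []
    nets = [f["subnet"] for f in fabrics.values()]
    # The two fabrics must use distinct, non-overlapping subnets.
    for i, a in enumerate(nets):
        for b in nets[i + 1:]:
            if a.overlaps(b):
                problems.append(f"subnets {a} and {b} overlap")
    # Every port must sit in the subnet of the switch it is cabled to.
    for device, switch, addr in ports:
        if ipaddress.ip_address(addr) not in fabrics[switch]["subnet"]:
            problems.append(f"{device} port {addr} is not in the {switch} subnet")
    # Every device needs at least one port on every switch.
    for device in sorted({p[0] for p in ports}):
        cabled = {p[1] for p in ports if p[0] == device}
        for switch in sorted(set(fabrics) - cabled):
            problems.append(f"{device} has no port on {switch}")
    return problems


print(check_plan(FABRICS, PORTS))
```

An empty list means each device has a port in each subnet and the two subnets do not overlap.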

Cisco Switch Requirements

The following are the minimum requirements for the Cisco switch:

  • NX-OS version 7.0(3)I4(7) or greater

  • Support for 100/50/25G

  • Support for jumbo frames (minimum 9000 bytes)

  • Support for the following QoS features:

    • Per-Priority Flow Control (PFC)

    • Data Center Bridging Extensions (DCBX)

    • QoS interface trust for DSCP and CoS

    • 8 queues per port

    • DSCP based classification and remarking

 

The following switch models have been validated by Pure Storage:

  • Cisco 9236C

  • Cisco 93180-YC

Switch Configuration

The switch configuration consists of configuring global (switch wide) parameters, and interface parameters for the initiator (host server) and the target (FlashArray).

Global configuration

On each switch, create a VLAN for the NVMe/RoCE traffic and configure the QoS settings as shown here.

Switch A

switchA#config t
switchA(config)#vlan 986
switchA(config-vlan-986)#name NVMe-RoCE-VLAN
switchA(config-vlan-986)#policy-map type network-qos ROCE-NQ-policy
switchA(config-pmap-nqos)#class type network-qos c-8q-nq3
switchA(config-pmap-nqos-c)#pause pfc-cos 3
switchA(config-pmap-nqos-c)#mtu 9216
switchA(config-pmap-nqos-c)#class-map type qos match-any ROCE-class
switchA(config-cmap-qos)#match dscp 26
switchA(config-cmap-qos)#policy-map type qos RoCE
switchA(config-pmap-qos)#class ROCE-class
switchA(config-pmap-c-qos)#set qos-group 3 
switchA(config-pmap-c-qos)#system qos
switchA(config-sys-qos)#service-policy type network-qos ROCE-NQ-policy
switchA(config-sys-qos)#end
switchA#copy run start

Switch B

switchB#config t
switchB(config)#vlan 987
switchB(config-vlan-987)#name NVMe-RoCE-VLAN
switchB(config-vlan-987)#policy-map type network-qos ROCE-NQ-policy
switchB(config-pmap-nqos)#class type network-qos c-8q-nq3
switchB(config-pmap-nqos-c)#pause pfc-cos 3
switchB(config-pmap-nqos-c)#mtu 9216
switchB(config-pmap-nqos-c)#class-map type qos match-any ROCE-class
switchB(config-cmap-qos)#match dscp 26
switchB(config-cmap-qos)#policy-map type qos RoCE
switchB(config-pmap-qos)#class ROCE-class
switchB(config-pmap-c-qos)#set qos-group 3 
switchB(config-pmap-c-qos)#system qos
switchB(config-sys-qos)#service-policy type network-qos ROCE-NQ-policy
switchB(config-sys-qos)#end
switchB#copy run start
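
The class map above matches DSCP 26 and the policy map places that traffic in qos-group 3, the same group that has PFC enabled for CoS 3. Host-side tools often ask for a ToS (traffic class) byte rather than a DSCP value, so the arithmetic is sketched below. The function names are illustrative, and whether a given host expects the DSCP value or the ToS byte is an assumption to confirm against that host's documentation.

```python
def dscp_to_tos(dscp: int, ecn: int = 0) -> int:
    """Build the 8-bit ToS/traffic-class byte: DSCP in the upper six bits,
    ECN in the lower two."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    if not 0 <= ecn <= 3:
        raise ValueError("ECN must be in the range 0-3")
    return (dscp << 2) | ecn


def af_name(dscp: int):
    """Return the Assured Forwarding name (RFC 2597) for a DSCP value,
    or None if the value is not an AF code point."""
    af_class = dscp >> 3              # upper three bits
    drop_precedence = (dscp >> 1) & 0b11
    if 1 <= af_class <= 4 and drop_precedence in (1, 2, 3) and dscp & 1 == 0:
        return f"AF{af_class}{drop_precedence}"
    return None


print(dscp_to_tos(26))   # ToS byte with ECN bits clear
print(af_name(26))       # Assured Forwarding name
print(26 >> 3)           # upper three bits of DSCP 26
```

DSCP 26 is AF31, and its upper three bits equal 3. That lines up with CoS 3 and qos-group 3 by convention only; on the switch the mapping is whatever the ROCE-class and RoCE policy definitions above state explicitly.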

Configure Array Interfaces

For the array interfaces, set the MTU to 9216, set priority flow control mode to on, and configure the access VLAN, spanning-tree port type, and QoS service policy as shown.

Switch A

switchA#conf t
switchA(config)#interface e1/1-2
switchA(config-if-range)#description FlashArray Ports
switchA(config-if-range)#mtu 9216
switchA(config-if-range)#switchport access vlan 986
switchA(config-if-range)#spanning-tree port type edge
switchA(config-if-range)#priority-flow-control mode on
switchA(config-if-range)#service-policy type qos input RoCE
switchA(config-if-range)#end
switchA#copy run start

Switch B

switchB#conf t
switchB(config)#interface e1/1-2
switchB(config-if-range)#description FlashArray Ports
switchB(config-if-range)#mtu 9216
switchB(config-if-range)#switchport access vlan 987
switchB(config-if-range)#spanning-tree port type edge
switchB(config-if-range)#priority-flow-control mode on
switchB(config-if-range)#service-policy type qos input RoCE
switchB(config-if-range)#end
switchB#copy run start

Configure Initiator Interfaces

For the initiator (host) interfaces, set the MTU to 9216, set priority flow control mode to on, and configure the access VLAN, spanning-tree port type, and QoS service policy as shown.

Switch A 

switchA#conf t
switchA(config)#interface e1/3-4
switchA(config-if-range)#description Initiator Ports
switchA(config-if-range)#mtu 9216
switchA(config-if-range)#switchport access vlan 986
switchA(config-if-range)#spanning-tree port type edge
switchA(config-if-range)#priority-flow-control mode on
switchA(config-if-range)#service-policy type qos input RoCE
switchA(config-if-range)#end
switchA#copy run start

Switch B

switchB#conf t
switchB(config)#interface e1/3-4
switchB(config-if-range)#description Initiator Ports
switchB(config-if-range)#mtu 9216
switchB(config-if-range)#switchport access vlan 987
switchB(config-if-range)#spanning-tree port type edge
switchB(config-if-range)#priority-flow-control mode on
switchB(config-if-range)#service-policy type qos input RoCE
switchB(config-if-range)#end
switchB#copy run start
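
The four interface stanzas above differ only in port range, description, and access VLAN, which makes them easy to get wrong when copied by hand. The sketch below renders them from one template. The commands, port ranges, and VLAN IDs are copied from this article; the dictionary layout, the function name, and the "Initiator Ports" description are illustrative assumptions.

```python
# Interface commands copied from the stanzas in this article. Only the
# port range, description, and VLAN vary between switches and roles.
TEMPLATE = """interface {ports}
  description {description}
  mtu 9216
  switchport access vlan {vlan}
  spanning-tree port type edge
  priority-flow-control mode on
  service-policy type qos input RoCE"""

VLANS = {"switchA": 986, "switchB": 987}

# Port ranges follow the examples above; the initiator description is an
# illustrative label, not a required value.
ROLES = {
    "array": ("e1/1-2", "FlashArray Ports"),
    "initiator": ("e1/3-4", "Initiator Ports"),
}


def render(switch: str, role: str) -> str:
    """Render the interface stanza for one switch and one port role."""
    ports, description = ROLES[role]
    return TEMPLATE.format(ports=ports, description=description, vlan=VLANS[switch])


for switch in VLANS:
    for role in ROLES:
        print(f"! {switch} / {role}")
        print(render(switch, role))
        print()
```

The output is meant for review before pasting into a configuration session, not as a replacement for checking the running configuration on the switch.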

QoS Validation

Once you have completed the setup of the FlashArray and the initiator and have an active NVMe/RoCE connection between the devices, you should see traffic counted in the Unicast column of QOS GROUP 3 on the switch ports.

Use the show queuing interface <ifname> command to verify that the traffic is being seen on queue 3.

switchA# show queuing interface e 1/1


slot  1
=======




Egress Queuing for Ethernet1/3/1 [System]
------------------------------------------------------------------------------
QoS-Group# Bandwidth% PrioLevel                Shape                   QLimit
                                   Min          Max        Units   
------------------------------------------------------------------------------
      7             -         1           -            -     -            9(D)
      6             0         -           -            -     -            9(D)
      5             0         -           -            -     -            9(D)
      4             0         -           -            -     -            9(D)
      3             0         -           -            -     -           (N/A)
      2             0         -           -            -     -            9(D)
      1             0         -           -            -     -            9(D)
      0           100         -           -            -     -            9(D)
+-------------------------------------------------------------+
|                              QOS GROUP 0                    |
+-------------------------------------------------------------+
|                           |  Unicast       |Multicast       |
+-------------------------------------------------------------+
|                   Tx Pkts |         2047204|          373178|
|                   Tx Byts |       197040266|        77061691|
| WRED/AFD & Tail Drop Pkts |               0|               0|
| WRED/AFD & Tail Drop Byts |               0|               0|
|              Q Depth Byts |               0|               0|
|       WD & Tail Drop Pkts |               0|               0|
+-------------------------------------------------------------+
|                              QOS GROUP 1                    |
+-------------------------------------------------------------+
|                           |  Unicast       |Multicast       |
+-------------------------------------------------------------+
|                   Tx Pkts |               0|               0|
|                   Tx Byts |               0|               0|
| WRED/AFD & Tail Drop Pkts |               0|               0|
| WRED/AFD & Tail Drop Byts |               0|               0|
|              Q Depth Byts |               0|               0|
|       WD & Tail Drop Pkts |               0|               0|
+-------------------------------------------------------------+
|                              QOS GROUP 2                    |
+-------------------------------------------------------------+
|                           |  Unicast       |Multicast       |
+-------------------------------------------------------------+
|                   Tx Pkts |               0|               0|
|                   Tx Byts |               0|               0|
| WRED/AFD & Tail Drop Pkts |               0|               0|
| WRED/AFD & Tail Drop Byts |               0|               0|
|              Q Depth Byts |               0|               0|
|       WD & Tail Drop Pkts |               0|               0|
+-------------------------------------------------------------+
|                              QOS GROUP 3                    |
+-------------------------------------------------------------+
|                           |  Unicast       |Multicast       |
+-------------------------------------------------------------+
|                   Tx Pkts |         1237976|            1025|
|                   Tx Byts |       119122616|          320022|
| WRED/AFD & Tail Drop Pkts |               0|               0|
| WRED/AFD & Tail Drop Byts |               0|               0|
|              Q Depth Byts |               0|               0|
|       WD & Tail Drop Pkts |               0|               0|
+-------------------------------------------------------------+
|                              QOS GROUP 4                    |
+-------------------------------------------------------------+
|                           |  Unicast       |Multicast       |
+-------------------------------------------------------------+
|                   Tx Pkts |               0|               0|
|                   Tx Byts |               0|               0|
| WRED/AFD & Tail Drop Pkts |               0|               0|
| WRED/AFD & Tail Drop Byts |               0|               0|
|              Q Depth Byts |               0|               0|
|       WD & Tail Drop Pkts |               0|               0|
+-------------------------------------------------------------+
|                              QOS GROUP 5                    |
+-------------------------------------------------------------+
|                           |  Unicast       |Multicast       |
+-------------------------------------------------------------+
|                   Tx Pkts |               0|               0|
|                   Tx Byts |               0|               0|
| WRED/AFD & Tail Drop Pkts |               0|               0|
| WRED/AFD & Tail Drop Byts |               0|               0|
|              Q Depth Byts |               0|               0|
|       WD & Tail Drop Pkts |               0|               0|
+-------------------------------------------------------------+
|                              QOS GROUP 6                    |
+-------------------------------------------------------------+
|                           |  Unicast       |Multicast       |
+-------------------------------------------------------------+
|                   Tx Pkts |               0|               0|
|                   Tx Byts |               0|               0|
| WRED/AFD & Tail Drop Pkts |               0|               0|
| WRED/AFD & Tail Drop Byts |               0|               0|
|              Q Depth Byts |               0|               0|
|       WD & Tail Drop Pkts |               0|               0|
+-------------------------------------------------------------+
|                              QOS GROUP 7                    |
+-------------------------------------------------------------+
|                           |  Unicast       |Multicast       |
+-------------------------------------------------------------+
|                   Tx Pkts |               0|               0|
|                   Tx Byts |               0|               0|
| WRED/AFD & Tail Drop Pkts |               0|               0|
| WRED/AFD & Tail Drop Byts |               0|               0|
|              Q Depth Byts |               0|               0|
|       WD & Tail Drop Pkts |               0|               0|
+-------------------------------------------------------------+
|                      CONTROL QOS GROUP                      |
+-------------------------------------------------------------+
|                           |  Unicast       |Multicast       |
+-------------------------------------------------------------+
|                   Tx Pkts |         1182973|               0|
|                   Tx Byts |        99449580|               0|
|            Tail Drop Pkts |               0|               0|
|            Tail Drop Byts |               0|               0|
+-------------------------------------------------------------+
|                         SPAN QOS GROUP                      |
+-------------------------------------------------------------+
|                           |  Unicast       |Multicast       |
+-------------------------------------------------------------+
|                   Tx Pkts |               0|               0|
|                   Tx Byts |               0|               0|
+-------------------------------------------------------------+


Per Slice Egress SPAN Statistics
---------------------------------------------------------------
         SPAN Copies Tail Drop Pkts                           0
         SPAN Input Queue Drop Pkts                           0
 SPAN Copies/Transit Tail Drop Pkts                           0
         SPAN Input Desc Drop  Pkts                           0




Ingress Queuing for Ethernet1/1
-----------------------------------------------------
QoS-Group#                 Pause                     
           Buff Size       Pause Th      Resume Th   
-----------------------------------------------------
      7              -            -            - 
      6              -            -            - 
      5              -            -            - 
      4              -            -            - 
      3          31200        10400         9568 
      2              -            -            - 
      1              -            -            - 
      0              -            -            - 




Per Slice Ingress Statistics
--------------------------------------------------------
Ingress Overflow Drop Pkts                           0






PFC Statistics
------------------------------------------------------------------------------
TxPPP:              30928486,   RxPPP:                    0
------------------------------------------------------------------------------
PFC_COS QOS_Group   TxPause             TxCount   RxPause             RxCount
      0         0  Inactive                   0  Inactive                   0
      1         0  Inactive                   0  Inactive                   0
      2         0  Inactive                   0  Inactive                   0
      3         3  Inactive            30928486  Inactive                   0
      4         0  Inactive                   0  Inactive                   0
      5         0  Inactive                   0  Inactive                   0
      6         0  Inactive                   0  Inactive                   0
      7         0  Inactive                   0  Inactive                   0
------------------------------------------------------------------------------
switchA#  
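
When checking many ports, the per-group Tx Pkts counters can be pulled out of saved command output with a short script. This is a minimal sketch that assumes the table layout shown above; the function name is hypothetical, and the sample text is an abbreviated copy of the output in this article.

```python
import re

GROUP_RE = re.compile(r"^\|\s*QOS GROUP (\d+)\s*\|$")
TX_PKTS_RE = re.compile(r"^\|\s*Tx Pkts\s*\|\s*(\d+)\|\s*(\d+)\|$")


def tx_pkts_by_group(output: str) -> dict:
    """Map each numbered QoS group to its (unicast, multicast) Tx Pkts.
    The CONTROL and SPAN groups are skipped because they have no number."""
    counters = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        match = GROUP_RE.match(line)
        if match:
            current = int(match.group(1))
            continue
        match = TX_PKTS_RE.match(line)
        if match and current is not None:
            counters[current] = (int(match.group(1)), int(match.group(2)))
            current = None
    return counters


# Abbreviated copy of the output shown above.
SAMPLE = """\
|                              QOS GROUP 0                    |
+-------------------------------------------------------------+
|                           |  Unicast       |Multicast       |
+-------------------------------------------------------------+
|                   Tx Pkts |         2047204|          373178|
|                   Tx Byts |       197040266|        77061691|
+-------------------------------------------------------------+
|                              QOS GROUP 3                    |
+-------------------------------------------------------------+
|                           |  Unicast       |Multicast       |
+-------------------------------------------------------------+
|                   Tx Pkts |         1237976|            1025|
|                   Tx Byts |       119122616|          320022|
+-------------------------------------------------------------+
|                      CONTROL QOS GROUP                      |
+-------------------------------------------------------------+
|                           |  Unicast       |Multicast       |
+-------------------------------------------------------------+
|                   Tx Pkts |         1182973|               0|
"""

groups = tx_pkts_by_group(SAMPLE)
unicast, _multicast = groups.get(3, (0, 0))
print("RoCE traffic seen in QoS group 3:", unicast > 0)
```

A nonzero unicast count for group 3 is the condition described above. A count of zero on an active connection suggests the traffic is not being classified into qos-group 3.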

Verify that priority flow control is enabled on the interface with the show interface <ifname> priority-flow-control command.  You should see Mode: On and Oper: On for the interface, indicating that PFC is enabled and active.  A VL bmap value of 8 (binary 1000, bit 3 set) indicates that PFC applies to CoS value 3.

switchA# show interface e 1/1 priority-flow-control


slot  1
=======


============================================================
Port               Mode Oper(VL bmap)  RxPPP      TxPPP     
============================================================


Ethernet1/1          On   On  (8)       0          30928486        
switchA# 
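
The VL bmap value is a bitmask with one bit per priority. The sketch below decodes it; the function name is hypothetical, and the function takes an integer because this article only shows the single-digit value 8, so the display base for larger values is not established here.

```python
def decode_vl_bitmap(bitmap: int) -> list:
    """Return the priorities (0-7) whose bit is set in the VL bitmap."""
    if not 0 <= bitmap <= 0xFF:
        raise ValueError("VL bitmap must fit in 8 bits")
    return [priority for priority in range(8) if bitmap & (1 << priority)]


print(decode_vl_bitmap(8))   # bit 3 is the only bit set
```

For the value 8, only bit 3 is set, which corresponds to the pause pfc-cos 3 setting in the network-qos policy.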

When there is congestion in the network, you can verify that PFC is working by using the show interface <ifname> priority-flow-control detail command to confirm that pause frames are being sent for Priority3.  Unless the counters have been reset on the switch or on the attached device (array or initiator), these values should match the values seen on that device.

switchA# show interface e 1/1 priority-flow-control detail


slot  1
=======




Ethernet1/1 
    Admin Mode: On  
    Oper Mode: On  
    VL bitmap: (8)      
    Total Rx PFC Frames: 0         
    Total Tx PFC Frames: 30928486         
    ---------------------------------------------------------------------------------------------------------------------
        |  Priority0  |  Priority1  |  Priority2  |  Priority3  |  Priority4  |  Priority5  |  Priority6  |  Priority7  |
    ---------------------------------------------------------------------------------------------------------------------
    Rx  |0            |0            |0            |0            |0            |0            |0            |0            
    ---------------------------------------------------------------------------------------------------------------------
    Tx  |0            |0            |0            |30928486     |0            |0            |0            |0            


switchA#
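
To compare these counters with the values reported by the array or initiator, the per-priority rows can be extracted from saved output. This is a minimal sketch that assumes the row layout shown above; the function name is hypothetical, and the sample is an abbreviated copy of the output in this article.

```python
def pfc_counters(output: str) -> dict:
    """Return {"Rx": [...], "Tx": [...]} with one pause-frame count per
    priority (index 0 is Priority0, index 7 is Priority7)."""
    counters = {}
    for raw in output.splitlines():
        fields = raw.strip().split("|")
        direction = fields[0].strip()
        # Only the rows labelled exactly "Rx" or "Tx" carry the counters.
        if direction in ("Rx", "Tx") and len(fields) > 1:
            counters[direction] = [int(f) for f in fields[1:] if f.strip()]
    return counters


# Abbreviated copy of the output shown above.
SAMPLE = """\
Ethernet1/1
    Admin Mode: On
    Oper Mode: On
    VL bitmap: (8)
    Total Rx PFC Frames: 0
    Total Tx PFC Frames: 30928486
        |  Priority0  |  Priority1  |  Priority2  |  Priority3  |  Priority4  |  Priority5  |  Priority6  |  Priority7  |
    Rx  |0            |0            |0            |0            |0            |0            |0            |0
    Tx  |0            |0            |0            |30928486     |0            |0            |0            |0
"""

counters = pfc_counters(SAMPLE)
paused = [p for p, count in enumerate(counters["Tx"]) if count > 0]
print("Priorities with Tx pause frames:", paused)
```

Pause frames on any priority other than 3 would suggest that PFC is enabled for a class other than the one configured in this article.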