VMware & iSCSI: FAQs

Is dynamic discovery supported with the FlashArray?

Yes, with a single caveat: if you are using CHAP in your environment, you must use static discovery.
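
If you need to configure static discovery from the command line, a minimal sketch is shown below. The adapter name (vmhba64), target address, and IQN are placeholders; substitute the values from your own environment.

# Dynamic (Send Targets) discovery - the array returns its target IQNs automatically
esxcli iscsi adapter discovery sendtarget add --adapter=vmhba64 --address=192.168.10.10:3260

# Static discovery - required when CHAP is in use; the target IQN is specified explicitly
esxcli iscsi adapter discovery statictarget add --adapter=vmhba64 --address=192.168.10.10:3260 --name=iqn.2010-06.com.purestorage:flasharray.example

# Rescan the adapter to pick up the newly discovered targets
esxcli storage core adapter rescan --adapter=vmhba64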

Is routed iSCSI supported with the FlashArray?

Yes, the Pure Storage FlashArray supports iSCSI routing. While it is supported, it is not recommended.

The following are reasons why routed iSCSI is not recommended:

  1. Introduces complexity in configuration. This is especially true if jumbo frames and multiple link speeds (1GbE / 10GbE) are in use. 
  2. Introduces complexity in troubleshooting. A routed network often lengthens troubleshooting when isolating where a problem originates, because there are more components, paths, and devices to review during times of unexpected behavior.
  3. Introduces potential latency to your network. Every additional hop adds time to each round trip. One of the reasons for buying a Pure Storage FlashArray is speed, so a slower routed network may prevent you from taking full advantage of the FlashArray.
  4. Introduces potential security concerns. While the FlashArray encrypts data at rest, I/O sent over the wire is in clear text (CHAP provides authentication, not encryption of the data in flight). Utilizing a closed network is recommended whenever possible.

If you need to use iSCSI routing, our recommendation is to use VMware vSphere 6.5 U3 or later, because those releases support routed networks with iSCSI port binding.
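
If routed iSCSI is unavoidable, one approach on vSphere 6.5 and later is to give the iSCSI VMkernel port its own gateway so storage traffic does not rely on the default TCP/IP stack gateway. The interface name and addresses below are placeholders, and option names can vary slightly between ESXi releases, so treat this as a sketch and verify it against the esxcli reference for your version.

# Assign a static IP and a dedicated gateway to the iSCSI VMkernel port (vSphere 6.5+)
esxcli network ip interface ipv4 set --interface-name=vmk2 --type=static --ipv4=192.168.20.11 --netmask=255.255.255.0 --gateway=192.168.20.1

# Confirm the routed target is reachable through that specific interface
vmkping -I vmk2 10.10.30.10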

NIC Teaming vs Port Binding: which should I use?

This is a good question and one that warrants an in-depth answer.

What is NIC Teaming with VMware?

At a basic level, NIC teaming allows you to group (associate) two or more physical network interface cards (NICs) with a virtual switch to increase network bandwidth, provide link failover capabilities, and add redundancy.

It is important to understand that everything you are configuring is at the network level. You still get link failover on cable or port failures, load balancing for increased performance, and so on. The problem, though, is that the Pluggable Storage Architecture (PSA) only receives information about logical paths, not physical ones. So while you have two physical NICs (pNICs) available for redundancy, that information is not directly exposed to the PSA. The ability to leverage SCSI sense codes to fail over and manage physical paths is therefore lost, which means the decisions made by the Native Multipathing Plugin (NMP) are not as effective.

To illustrate where this can be problematic, imagine an ESXi host using NIC teaming when a problem arises due to a degraded cable on one of the NICs. In this scenario we would expect SCSI sense code errors to be reported to and from the FlashArray as performance degrades. It would not be uncommon to see ABORT_TASK, CTIO TIMEOUT, and other critical errors indicating problems with a specific adapter. So while the ESXi host understands there is a problem with these logical paths, it does not have the ability to fail over from one pNIC to another (that layer is obscured). It would only fail over to the other pNIC if it lost link connectivity or if beacon probing failed, depending on how teaming was configured. This lack of failover decision making often leads to an administrator manually intervening to resolve the issue.

The main point to understand is that NIC teaming covers you at the network level for network problems. One benefit people latch onto is that a physical link failure results in a fairly seamless failover (i.e., a very small pause in I/O, if any), because a link failure is a clear-cut, easily recognizable network event: the failed pNIC is marked dead and I/O continues down the remaining pNIC. The problem, though, is that you are using iSCSI, so it makes more sense to base failover decisions on storage multipathing rather than on network availability alone.
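
If you are relying on NIC teaming, it is worth confirming which failure-detection and failover settings are actually in effect, since (as described above) those are the only signals that will trigger a failover. The vSwitch and port group names below are placeholders.

# Show the teaming and failover policy for the vSwitch (active/standby uplinks, link vs. beacon failure detection)
esxcli network vswitch standard policy failover get --vswitch-name=vSwitch0

# Show any overrides applied at the port group level
esxcli network vswitch standard portgroup policy failover get --portgroup-name=iSCSI-PG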

What is port binding with VMware?

This term can be a little misleading, as you might initially think of it as link aggregation or something of that nature. That is not the case; it should really be thought of as "iSCSI multipathing". In this configuration you associate a single pNIC with a single VMkernel adapter (vmk) and then bind them for iSCSI use only. You can dedicate one or more VMkernel ports to iSCSI; for the sake of redundancy and performance, a minimum of two should be configured.
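
As a rough sketch, binding two VMkernel ports to the software iSCSI adapter from the command line looks like the following. The adapter and vmk names are placeholders, and each vmk's port group must already be configured with exactly one active uplink (no standby uplinks) for the binding to be compliant.

# Bind each dedicated iSCSI VMkernel port to the software iSCSI adapter
esxcli iscsi networkportal add --adapter=vmhba64 --nic=vmk1
esxcli iscsi networkportal add --adapter=vmhba64 --nic=vmk2

# Verify the bindings
esxcli iscsi networkportal list --adapter=vmhba64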

The important takeaway is that with iSCSI multipathing you are putting all of the information into the hands of the PSA / NMP. Your ESXi host can now make smarter decisions based on storage-related events, not just network-related events. Unlike NIC teaming, this includes the ability to fail over physical paths based on SCSI sense codes.
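
With the bindings in place you can see exactly what the PSA now has to work with. The device identifier below is a truncated placeholder for a FlashArray volume's naa ID.

# List active iSCSI sessions (typically one per bound vmk per target portal)
esxcli iscsi session list

# Show every path to a given device, including its physical adapter and state
esxcli storage core path list --device=naa.624a9370xxxxxxxxxxxxxxxx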

Another important note about port binding is that Round Robin with the latency-based policy on your devices will behave more reliably when port binding is in use, for the same reasons outlined previously.
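
For reference, a sketch of switching a device to Round Robin with the latency-based sub-policy from the command line is shown below. The latency option requires ESXi 6.7 U1 or later, and the device ID is a placeholder.

# Set the path selection policy for the device to Round Robin
esxcli storage nmp device set --device=naa.624a9370xxxxxxxxxxxxxxxx --psp=VMW_PSP_RR

# Switch the Round Robin sub-policy to latency-based path selection
esxcli storage nmp psp roundrobin deviceconfig set --device=naa.624a9370xxxxxxxxxxxxxxxx --type=latency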

So which should you use?

The answer here is clear: port binding whenever possible. This is recommended as a best practice not only by Pure Storage but by VMware as well. An in-depth review of VMware iSCSI best practices can be found here. Utilizing the MPIO storage stack to its full potential for storage events simply makes sense: it provides better protection, better decision making, and at times better performance for your environment.

Unfortunately, there are some restrictions that make port binding unavailable in some environments. These are clearly outlined by VMware, but here is a quick recap:

- iSCSI networks that are on different IP networks / broadcast domains

- If routing is required (ESXi 6.0 and lower)

- If you are utilizing LACP / link aggregation on the physical NICs

You can review the Considerations for using software iSCSI port binding in ESX/ESXi KB for in-depth information.

While port binding is the preferred method for connectivity, it is important to be clear: Pure Storage and VMware both support NIC teaming. If your circumstances prevent you from implementing port binding, you will still be fully supported by both companies, and we will investigate any issues that arise with the same priority and urgency as if port binding were in use.

Should I enable jumbo frames with iSCSI?

The general recommendation is to use the standard MTU of 1500 for iSCSI connectivity.

This recommendation is predicated upon several things:

  1. Simplicity. Enabling jumbo frames requires setting the proper MTU throughout the entire network: the vSphere switch, the VMkernel port (vmknic), the physical NIC (pNIC), the physical switches, any routers (if iSCSI is routed), and finally the FlashArray target ports. It is all too common for one or more of these components to be missed, resulting in reported stability or performance problems.
  2. Not all environments benefit from jumbo frames. This was at one time a common (and rather heated) discussion. The anthem was almost always "enable jumbo frames for best performance". The reality, though, depends on the workload between the initiators and the target. If your applications consistently send larger I/O requests, there is a good chance jumbo frames could help; how much varies greatly, so we won't go into that here. The caveat is that if the opposite is true (mostly smaller I/O requests), jumbo frames can actually impose a performance penalty: if the host waits to fill a jumbo frame with smaller I/O requests, transmission of that I/O is delayed and a slight slowdown can be observed. How much? Again, it varies and is not in the scope of this document.

The key takeaway here is to know your environment. If you find jumbo frames are optimal for your environment, please involve all proper parties end-to-end to ensure everything is implemented correctly.
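
If you do decide to enable jumbo frames, a minimal sketch of the ESXi side is shown below; the physical switches, any routers, and the FlashArray target ports must be configured separately. The vSwitch and vmk names are placeholders.

# Raise the MTU on the standard vSwitch carrying iSCSI traffic
esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000

# Raise the MTU on each iSCSI VMkernel port
esxcli network ip interface set --interface-name=vmk1 --mtu=9000
esxcli network ip interface set --interface-name=vmk2 --mtu=9000

# Confirm the new MTU values
esxcli network ip interface list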

If you decide to implement jumbo frames, the following command is vital to ensure you have properly configured your environment end-to-end:

vmkping -I <iscsi_vmk_interface> -d -s 8972 <ip_addr_of_target>

This ensures packets are not fragmented during the ping test (-d) and tests a jumbo-frame-sized payload (-s 8972, which is the 9000-byte MTU minus 28 bytes of IP and ICMP headers).