In 2017, Pure Storage introduced the wildly successful ActiveCluster feature, active-active synchronous replication over TCP/IP, with the release of Purity 5.0. Along with that success came the realization that many customers also wanted to leverage existing Fibre Channel estates to provide the replication connectivity for ActiveCluster. We’ve now obliged those customers with Purity 6.1 and ActiveCluster over Fibre Channel. Customers can now utilize the reliability, security, and performance advantages of Fibre Channel for ActiveCluster replication.
Though the transport method might be different, the deployment and configuration of ActiveCluster over Fibre Channel (AC/FC) is nearly identical to the ActiveCluster everyone is already familiar with.
In sticking with our commitment to performance and simplicity, AC/FC has the following highlights:
- AC/FC uses NVMe instead of SCSI for more efficient transport
- Auto-discovery of remote array replication WWNs
- A detailed connection topology between arrays showing redundancy, path status and health
- Purity now displays detailed port statistics for all interfaces, allowing for monitoring and troubleshooting of the replication network. This includes real-time updates and historical graphed data with a one-year look-back
- Fibre Channel agnosticism with regards to distance technology, number of fabrics, or FC routing
Currently, converting ActiveCluster over IP to ActiveCluster over FC is not supported. Support for conversion is coming in a future Purity release.
This document assumes a high-level understanding of ActiveCluster, including, but not limited to, concepts such as host access (uniform and non-uniform), what a pod is, and how the mediator works.
If you are new to ActiveCluster, please familiarize yourself with ActiveCluster before deployment:
- Solution Overview, a concise introduction to ActiveCluster
- Quick Start Guide, a high-level overview of setting up ActiveCluster
- ActiveCluster Requirements and Best Practices, a more detailed discussion of host configuration
- Planning and Design Guide, which discusses management network requirements, host access, and ActiveCluster features such as our cloud mediator and preferred arrays
- Purity 6.1 and newer
- FlashArray Model //XR2 or newer
- Two dedicated Fibre Channel ports per controller (four per FlashArray) are required. The Fibre Channel ports used for replication can either be existing FC ports or add-on ports provided by installing dual-port Fibre Channel HBAs in each controller.
- Brocade or Cisco switches
- Arrays cannot be directly connected; FC switches must exist between the arrays
- FC switches should be 16Gb or newer. Compatibility with older switches might be limited
- WAN latency cannot exceed 5 ms round trip (2 ms for the //X20R2 and //X50R2). Distance is not a limitation, only the latency between sites
- These maximum latencies are specific to the 6.1.0 release. The maximum supported round-trip latency will be increased in an upcoming release
- A FlashArray cannot simultaneously utilize ActiveCluster IP and FC replication links; only one replication transport per array
Deployment and Configuration
Deploying ActiveCluster over Fibre Channel consists of three steps:
- Select and enable the FC ports
- Zone the ports
- Connect the arrays
Any two FC ports on the FlashArray can be used, but the typical deployment will consist of a dual-port HBA installed in PCI slot 2. As seen below, this is the top-right PCI slot, with Fibre Channel ports FC8 and FC9.
Once the ports are installed or selected, contact support to enable ActiveCluster over Fibre Channel and provide your chosen ports.
All replication WWNs must go into a single zone. For example, if there are 8 WWNs total, 4 for each array, then in a single fabric there should be 1 zone with 8 WWNs.
The local array has the following replication interfaces and WWNs:
The remote array has the following replication interfaces and WWNs:
If both arrays are in a single fabric, the zone would look like the following:
REPLICATION_zone:
  52:4A:93:72:C4:E9:2C:08
  52:4A:93:72:C4:E9:2C:09
  52:4A:93:72:C4:E9:2C:18
  52:4A:93:72:C4:E9:2C:19
  52:4A:93:73:F0:BD:EB:08
  52:4A:93:73:F0:BD:EB:09
  52:4A:93:73:F0:BD:EB:18
  52:4A:93:73:F0:BD:EB:19
If using dual fabrics, where all 8 ports are split between 2 fabrics, then each fabric should have a single zone with 4 WWNs per zone.
More specifically, each array port must be able to communicate with both controllers on the remote array for maximum resiliency.
Blue represents Fabric A
Purple represents Fabric B
The actual ports do not matter; what is important is to zone and connect the arrays in such a way that each port on each array can see at least one port on each controller on the remote array.
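As an illustrative sketch (not a Pure Storage tool), the rule above can be expressed as a quick check. The grouping of WWNs by controller below is assumed for the example; substitute your own arrays' port-to-controller mapping:

```python
# Sketch: verify the ActiveCluster zoning rule -- every replication port
# on each array must share a zone with at least one port on BOTH
# controllers of the remote array.

def zoning_is_redundant(local, remote, zones):
    """local/remote map a controller name to its replication WWNs;
    zones is a list of sets of WWNs (ports in a common zone can talk)."""
    def reachable(wwn):
        peers = set()
        for zone in zones:
            if wwn in zone:
                peers |= zone
        peers.discard(wwn)
        return peers

    for src, dst in ((local, remote), (remote, local)):
        for ports in src.values():
            for wwn in ports:
                # The port must reach >= 1 port on every remote controller.
                if not all(reachable(wwn) & set(p) for p in dst.values()):
                    return False
    return True

# WWNs from the example arrays; controller grouping assumed for illustration.
local = {
    "CT0": ["52:4A:93:72:C4:E9:2C:08", "52:4A:93:72:C4:E9:2C:09"],
    "CT1": ["52:4A:93:72:C4:E9:2C:18", "52:4A:93:72:C4:E9:2C:19"],
}
remote = {
    "CT0": ["52:4A:93:73:F0:BD:EB:08", "52:4A:93:73:F0:BD:EB:09"],
    "CT1": ["52:4A:93:73:F0:BD:EB:18", "52:4A:93:73:F0:BD:EB:19"],
}

# Single-fabric case: one zone holding all eight WWNs satisfies the rule.
single_zone = [set(sum(local.values(), []) + sum(remote.values(), []))]
```

A zone that omits one remote controller's ports would fail this check, which is exactly the deficiency the FlashArray surfaces under Health -> Connections -> Array Connections.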
The switch zoning configuration might look like the following:
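For example, on a Brocade switch the single-fabric zone above might be created as follows. This is a sketch: the zone name matches the earlier example, but the configuration name (PROD_cfg) is illustrative; add the zone to your fabric's actual active configuration.

```
zonecreate "REPLICATION_zone", "52:4a:93:72:c4:e9:2c:08; 52:4a:93:72:c4:e9:2c:09; 52:4a:93:72:c4:e9:2c:18; 52:4a:93:72:c4:e9:2c:19; 52:4a:93:73:f0:bd:eb:08; 52:4a:93:73:f0:bd:eb:09; 52:4a:93:73:f0:bd:eb:18; 52:4a:93:73:f0:bd:eb:19"
cfgadd "PROD_cfg", "REPLICATION_zone"
cfgsave
cfgenable "PROD_cfg"
```

In the dual-fabric layout, each fabric would get its own four-WWN zone created the same way.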
The FlashArray will always provide feedback if there are any deficiencies under Health -> Connections -> Array Connections. This assists in confirming that all ports are properly configured.
In the above image, you can see that the Local Array is missing FC8 from CT1. When properly zoned and connected, it will look like the following:
After selecting and zoning the replication ports, it’s time to connect both arrays.
- Go to the remote array and copy the connection key:
pureuser> purearray list --connection-key
Connection Key
Ae707eda-ca7b-872d-4c11-1d61d12ae8f2
- Connect arrays:
pureuser> purearray connect --management-address 10.31.0.89 --type sync-replication --replication-transport fc --connection-key
Enter the connection key of the target array: 85a39a84-fb7a-e2c7-90fb-f93f22f3d4ff
Name     ID                                    Version      Management Address  Replication Address      Status     Throttled  Type              Transport
tmefa12  d0441d28-df41-4b21-865a-2d3f0d0c858e  6.1.0.beta4  10.31.0.89          52:4A:93:73:F0:BD:EB:08  connected  False      sync-replication  fc
                                                                                52:4A:93:73:F0:BD:EB:19
                                                                                52:4A:93:73:F0:BD:EB:09
                                                                                52:4A:93:73:F0:BD:EB:18
Acquire array key from the remote array
Storage -> Array Connections -> Get Connection Key
Copy Connection Key
Create a new Array Connection
Storage -> Array Connections -> click on the plus sign to add an array
In the dialog box, paste in the remote array Connection Key, enter the remote array Management Address, choose Sync Replication from Type and finally, choose Fibre Channel (FC) for Replication Transport.
You will see a complete Array Connections box after:
Replication Network Monitoring, Reporting, and Troubleshooting
Purity 6.1 introduces a new tool for monitoring and reporting port statistics. Let’s walk through a troubleshooting example for an overview of its capabilities.
Example: Poor replication performance
The array’s replication interfaces are the same on both controllers: FC8 and FC9. These are the ports we’ll focus on.
Go to Health -> Network and select Fibre Channel:
In the field below Ports, enter “8” to review CT0.FC8 and CT1.FC8
Note that Summary is selected and that the table provides live updates. There are no errors reported, as seen in the rightmost column.
Enter “9” to review the other replication ports CT0.FC9 and CT1.FC9.
CT0.FC9 is reporting errors, 9 at the moment this screenshot was taken.
Click on Fibre Channel Errors for a summary of those errors.
We have 2 instances of Loss of Sync received and, a moment later, 11 instances of Invalid Words transmitted.
Without an understanding of what this means, you have some helpful data to relay to support for assistance. You can likely exonerate your application, hosts, host network and the array itself and focus on the replication network.
Or, having become familiar with what these errors mean through the inline FlashArray manual, you might safely assume that there is an SFP or cable problem. You can replace the SFP and/or the cable and resume monitoring.
Perhaps a little further confirmation would be preferred before visiting your data center. Let’s review the port error history.
Scroll down for port history. The topmost graph records Fibre Channel Errors/s.
Using the zoom period menu, you can review from the last hour (most granular) up to a year back. Below Fibre Channel Errors/s is Fibre Channel Bandwidth, allowing you to line up peak bandwidth demands with the error rate.
Select your preferred date and time:
Note that once we clicked on the graph, we received the recorded metrics for that specific date and time. The graph shows 5 errors, and in the table above it we can see 1 Loss of Sync and 4 Invalid Words. This further confirms a problem between CT0.FC9 and the switch it connects to, perhaps justifying that data center visit and giving some scope as to the work needed.
Network Health is a tool for monitoring and troubleshooting, providing granular controls and live updating. It’s one of many possible tools, and troubleshooting network problems can be exceedingly complicated. Network Health isn’t intended to diagnose problems, nor does it provide guidance for remediation. If you are unfamiliar with these metrics, do use the data to help Pure Storage Support expedite your support request.