Microsoft Failover Clustering - Physical Disk Cluster Resource Default Timeouts
Clustered disk resources are used by cluster roles such as a File Server or a SQL Server Failover Cluster Instance (FCI). You can view a disk resource's restart settings by right-clicking the resource and selecting Properties.
Notice that by default, the Maximum restarts in the specified period is set to 1, and the Delay between restarts is set to 0.5, which is half a second. These settings are fine for most Pure FlashArray customers. However, in complex configurations that add a little more latency to the stack, we have seen a 0.6-2.5 second outage cause the disks, and the cluster roles that own them, to fail over during FlashArray upgrades. These rare configurations include iSCSI initiated inside a VM that connects directly to the FlashArray, and complex storage networks with many hops. Changing these settings on the individual disk resources, so that the Maximum restarts in the specified period multiplied by the Delay between restarts is 5 seconds or more, has eliminated the resource failover in production during FlashArray upgrades.
In this second image of the Cluster Disk 4 Properties, the Maximum restarts in the specified period is set to 5 and the Delay between restarts is set to 1, which gives a total duration of 5 x 1 = 5 seconds, versus the default of 1 x 0.5 = 0.5 seconds.
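The arithmetic behind the two sets of values above can be checked with a short sketch. This is only an illustration of the multiplication described in this article, not a tool that reads or changes cluster settings. The names `RestartPolicy`, `window_seconds`, `meets_target`, and `TARGET_WINDOW_SECONDS` are hypothetical and are not part of any Microsoft or Pure Storage API.

```python
from dataclasses import dataclass

# Target from this article: restarts x delay should be 5 seconds or more.
TARGET_WINDOW_SECONDS = 5.0


@dataclass(frozen=True)
class RestartPolicy:
    """Hypothetical holder for the two values shown in the disk Properties dialog."""
    max_restarts: int      # "Maximum restarts in the specified period"
    delay_seconds: float   # "Delay between restarts", in seconds

    def window_seconds(self) -> float:
        # Total duration as computed in this article: restarts multiplied by delay.
        return self.max_restarts * self.delay_seconds

    def meets_target(self) -> bool:
        # True when the total duration reaches the 5-second target.
        return self.window_seconds() >= TARGET_WINDOW_SECONDS


# Default values and the tuned values from the Cluster Disk 4 example.
default_policy = RestartPolicy(max_restarts=1, delay_seconds=0.5)
tuned_policy = RestartPolicy(max_restarts=5, delay_seconds=1.0)

for name, policy in (("default", default_policy), ("tuned", tuned_policy)):
    print(f"{name}: {policy.window_seconds():.1f} s, "
          f"meets 5 s target: {policy.meets_target()}")
```

With the default values the total is 0.5 seconds, which is shorter than the 0.6-2.5 second outages described above, while the tuned values reach the 5-second target.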
Cluster disks can also be promoted to a Cluster Shared Volume (CSV), and CSV disks do not have these settings available. CSV disks are not vulnerable to this failover because all nodes connect to the disk and have write access simultaneously. If one node is unable to connect to the storage directly, that node reaches the storage through another node (redirected I/O).