Best Practices for Microsoft SQL Server on FlashArray
This applies to all supported versions of SQL Server per the Microsoft Support Matrix. If anything is version-specific, it will be called out.
Introduction
Whether you are running Microsoft SQL Server on physical servers or in a virtual machine, this document will cover the generic best practices that impact deployments on FlashArray. Additional and updated best practices for virtualized deployments of SQL Server with Hyper-V and VMware on FlashArray are forthcoming and will be linked here once available.
Best practices for SQL Server in Azure using Cloud Block Store (CBS) can be found in "Best Practices for Microsoft SQL Server on Azure with Cloud Block Store".
More information on SQL Server with ActiveCluster can be found in the paper “Pure Storage ActiveCluster with SQL Server”
Best practices for SQL Server using ActiveDR can be found in the section "SQL Server" in “ActiveDR and Microsoft Solutions”.
Adhere to general SQL Server best practices. This article focuses mainly on the ones that are important for FlashArray. For example, options such as maximum degree of parallelism, instant file initialization, and file autogrowth, as well as tasks such as sizing files, may still need attention, but Pure does not provide guidance for them because they are specific to your implementation.
Configuration
At the FlashArray level, a Volume must be created and presented to the OS for use with SQL Server. There is nothing specific to SQL Server that must be done when creating the Volume. Configuring the Volume in the OS affects performance and possibly other features in FlashArray such as snapshots. This section will cover the configuration aspects of FlashArray for SQL Server.
Formatting Volumes
The operating system may refer to the presented Volume as a disk. If this is a physical host, or a virtual machine using some sort of connectivity that bypasses the hypervisor such as an RDM in VMware, virtual fibre channel in Hyper-V, or in-guest iSCSI, the disk is a Volume that was created on the FlashArray and to which the host has been granted access. This section will cover the best practices for formatting volumes for use with SQL Server.
File System Choice
Different operating systems support different file systems for use with SQL Server deployments.
Linux
Microsoft supports limited distributions of Linux and for each distribution, a select list of file systems. Consult the section "Supported platforms" in the Microsoft Learn topic "Installation guidance for SQL Server on Linux". If the distribution and/or the file system is not listed, it is not supported by Microsoft in any capacity. Pure only supports SQL Server configurations that are supported by Microsoft.
Windows Server
Windows Server supports both NTFS and ReFS for a file system choice when formatting. Pure supports both as well. Consult the "File Systems" section of the Microsoft Support Matrix for any caveats with FlashArray.
Disk Type
With Windows Server, a disk can be formatted as either Master Boot Record (MBR) or GUID Partition Table (GPT). Pure supports both but strongly recommends GPT. For more information on these two disk types, see the section "Compare partition styles - GPT and MBR" in the topic "Initialize new disks" in Microsoft Learn. The partition style can be set via the Disk Management GUI, via DiskPart with the convert command after bringing the disk online, or with the Initialize-Disk PowerShell cmdlet and its -PartitionStyle parameter.
Figure 1 shows setting the value to GPT while initializing in the Disk Management GUI.
Figure 1. Initializing the disk as GPT
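The same initialization can be sketched in PowerShell. The disk number below is a placeholder for illustration; confirm the correct disk with Get-Disk before running anything.

```powershell
# List disks to identify the newly presented FlashArray Volume
Get-Disk

# Bring the disk online and initialize it with a GPT partition style
# (disk number 1 is a placeholder)
Set-Disk -Number 1 -IsOffline $false
Initialize-Disk -Number 1 -PartitionStyle GPT
```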
Allocation Unit Size
With Windows Server, a disk can be formatted with a specific allocation unit size, sometimes referred to as block size or bytes per cluster. By default, Windows Server uses 4 KB, which is not optimal for SQL Server. The SQL Server best practice is to format at 64 KB, which is valid for both NTFS and ReFS. However, understand that an I/O's block size (covered later in this document) and the file system's allocation unit size are two different things. Do not confuse the two concepts.
Figure 2 shows what the option looks like in the GUI-based Wizard.
Figure 2 Allocation unit size set correctly when formatting
The allocation unit size of a Volume/disk in Windows Server can be verified via PowerShell or the command line. With either method, if the value is something other than 65536 (64 KB), such as 4096, the Volume was formatted incorrectly.
In PowerShell, the value is represented by the AllocationUnitSize property of a Volume returned by the Get-Volume cmdlet. This is one way to use the cmdlet:
(Get-Volume | Where {$_.DriveLetter -eq 'driveletter'}).AllocationUnitSize
where driveletter is the drive letter of the Volume to be interrogated. An example is shown in Figure 3.
Figure 3. Using Get-Volume to see allocation unit size for a specific drive letter
To check this value from the command line, run
fsutil fsinfo ntfsinfo driveletter
where driveletter is the drive letter of the Volume to be interrogated. An example is shown in Figure 4. Allocation unit size is represented as Bytes Per Cluster.
Figure 4. Using fsutil to see allocation unit size
It is crucial to ensure the allocation unit size is implemented correctly when formatting before installing SQL Server, creating databases, and storing data. Changing the allocation unit size cannot be done without reformatting the Volume.
Formatting Command Best Practices
- Use /L (format command) or -UseLargeFRS (Format-Volume cmdlet) to create Volumes with large File Record Segments. This option is not available in the Wizard if using the GUI.
- Specify the allocation unit size with /A:64K (format command) or -AllocationUnitSize 65536 (Format-Volume cmdlet).
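Putting these options together, a PowerShell sketch might look like the following. The disk number and the volume label are placeholders to adapt to your environment.

```powershell
# Create a partition on the FlashArray disk (placeholder disk number 1)
# and format it NTFS with a 64 KB allocation unit size and large
# File Record Segments, as recommended above.
New-Partition -DiskNumber 1 -UseMaximumSize -AssignDriveLetter |
    Format-Volume -FileSystem NTFS -AllocationUnitSize 65536 `
        -UseLargeFRS -NewFileSystemLabel 'SQLData'
```

The equivalent command-line form is format <driveletter>: /FS:NTFS /A:64K /L /Q.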
It is important to understand the difference between allocation unit size and a SQL Server I/O.
An 8 KB page is the fundamental unit of data storage in SQL Server. SQL Server's page size is not configurable. An extent is eight physically contiguous pages, or 64 KB. To fully understand how SQL Server reads and writes, reference the topic "Pages and extents architecture guide" in Microsoft Learn. Also core to understanding how I/O works in SQL Server is how indexes work. For more information, consult the topic "SQL Server and Azure SQL index architecture and design guide" in Microsoft Learn.
Reads and writes are both important to SQL Server workloads. SQL Server performs different kinds of I/Os related to both reads and writes. The table below shows the type of operation and its I/O size.
| I/O Type | I/O Size |
|---|---|
| Transaction log write | 512 bytes-60 KB |
| Checkpoint/Lazy writer | 8 KB-1 MB |
| Read-ahead scans | 128 KB-512 KB |
| Bulk inserts | 256 KB |
| Backup/Restore | 1 MB |
| Columnstore read-ahead | 8 MB |
| File initialization | 8 MB |
| In-memory OLTP checkpoint | 1 MB |
Understanding your workload along with baselines and benchmarks helps with understanding performance and configuration. No two workloads are the same. For example, some are write heavy which means not only the data but the transaction log, while others see occasional writes with heavy reads.
Writing to the transaction log is one of the more important elements of a SQL Server workload. If that process encounters issues, it can have a cascading effect on availability and performance. It is generally not recommended to change transaction durability to delayed, as that would affect recovery. Writes should be persisted, also known as hardened, immediately. Transaction log writes are sector aligned. For FlashArray, this value is 512 bytes. SQL Server will perform log flushes that can range anywhere from 512 bytes to 60 KB to ensure write efficiency; 60 KB is the maximum size of a SQL Server log flush from the log buffer.
SQL Server Data and Log Files
Each system or user database is composed of at least one data file and a transaction log file. The primary data file has an extension of .mdf and additional data files have an extension of .ndf. Outside of extremely rare situations, there should be only a single transaction log file per database, which has an extension of .ldf.
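The files belonging to every database on an instance, along with their on-disk locations, can be listed with a query against the sys.master_files catalog view (the column selection here is illustrative):

```sql
-- List every data and log file for all databases on the instance,
-- including its physical location and size in MB.
-- sys.master_files reports size in 8 KB pages, hence the conversion.
SELECT DB_NAME(database_id) AS database_name,
       name                 AS logical_name,
       type_desc,           -- ROWS (data) or LOG
       physical_name,
       size * 8 / 1024      AS size_mb
FROM sys.master_files
ORDER BY database_name, type_desc;
```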
Optimizing the configuration of SQL Server data and transaction log files is crucial for both functionality and performance. This section will cover those aspects as they relate to FlashArray.
File Placement
Where you place data and transaction log files depends on quite a few factors, many of which are specific to your environment. There are five tenets that must be adhered to when it comes to storage layout with FlashArray:
- Performance
- Capacity
- Use of snapshots and clones
- Recoverability (and related to #3 if those are used)
- Manageability
Those are not in a particular order. For example, with manageability, creating large numbers of logical disks and physical volumes in some environments can be problematic. Windows Server has a limited number of drive letters before you need to consider other options such as shares or mount points.
Bottom line: whether you place all data files on one disk, all transaction log files on another, split everything out onto their own disks, or do something completely different, the solution has to ensure that it performs well, is manageable, and can be recovered.
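As one illustrative layout, not a prescription (the database name, drive letters, and paths below are assumptions), data and transaction log files can be split across Volumes at database creation time:

```sql
-- Hypothetical layout: data files on D: (one FlashArray Volume),
-- transaction log on L: (a separate Volume).
CREATE DATABASE SalesDB
ON PRIMARY
    (NAME = SalesDB_data1, FILENAME = 'D:\SQLData\SalesDB_data1.mdf'),
    (NAME = SalesDB_data2, FILENAME = 'D:\SQLData\SalesDB_data2.ndf')
LOG ON
    (NAME = SalesDB_log,   FILENAME = 'L:\SQLLog\SalesDB_log.ldf');
```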
Number and Size of Files
Along with file placement, the number of data files must be considered. That depends on the workload. The application should be tested not only to develop a baseline and benchmark, but to determine the proper number of data files for maximum performance. The number and types of files affect the final layout.
tempdb
While the other system databases are important for SQL Server availability, reliability, and day-to-day operations, tempdb is a special case that has additional considerations. Some built-in functionality of SQL Server uses tempdb such as online index rebuilds. Some application workloads may use tempdb as a "scratch" space; others may not. A key aspect of tempdb configuration is understanding how it is used in your environment.
As with user databases, having the right number of data files is crucial. For guidance on tempdb, it is covered in the Microsoft Learn topic "tempdb database" which also has a link to another relevant Microsoft Learn topic "Recommendations to reduce allocation contention in SQL Server tempdb database".
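As a sketch of that guidance (the file count, sizes, and path are placeholders to adjust for your environment), tempdb data files can be resized and added with ALTER DATABASE:

```sql
-- Size the existing tempdb data file and add a second, equally sized file.
-- Keeping data files the same size with matching growth settings helps
-- avoid allocation skew between files.
ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, SIZE = 8GB, FILEGROWTH = 1GB);
ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev2, FILENAME = 'T:\tempdb\tempdev2.ndf',
              SIZE = 8GB, FILEGROWTH = 1GB);
```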
There are specific recommendations for tempdb when ActiveCluster, ActiveDR, and/or CBS are used with SQL Server deployments.
| Feature | Recommendation |
|---|---|
| ActiveCluster | This should be a uniform configuration, so all database and transaction log files including tempdb should be in the Pod. |
| ActiveDR | Do not place tempdb in the Pod; this helps minimize replication traffic. The other system databases can be placed in the Pod. If they are not, the relevant administrator must manually sync instance-level objects between the source and the destination. |
| Cloud Block Store | See the section "tempdb Best Practices in Azure" in the CBS document linked earlier. |
Backup, Restore, and Snapshots
This section will cover best practices related to backing up and restoring SQL Server databases.
Protection Groups
If FlashArray features such as snapshots and replication are in use, related data and transaction log Volumes should be placed in a protection group. A protection group defines members (which can be Volumes, hosts, or host groups) that are protected together with point-in-time consistency. The members within the protection group should have common data protection requirements and the same snapshot, replication, and retention schedules.
For more information on protection groups, refer to the administration guide for the version of Purity in use. Both the CLI and GUI version of these guides can be found at this link.
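For illustration only (the protection group name and suffix are placeholders), a manual, point-in-time-consistent snapshot of a protection group can be taken from the Purity CLI:

```shell
# Snapshot every member of the protection group 'sql-prod-pg' at once,
# tagging the snapshot with a descriptive suffix.
purepgroup snap --suffix pre-maintenance sql-prod-pg
```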
Native SQL Server Backup Compression
SQL Server has built-in compression for native backups that is available in the Enterprise and Standard Editions of SQL Server (as well as Developer). By default, this is not enabled when an instance of SQL Server is installed. Backup compression saves space in the backup file at the tradeoff of additional CPU.
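Backup compression can be requested per backup with the COMPRESSION option, or enabled instance-wide via sp_configure. A sketch (the database name and path are placeholders):

```sql
-- Enable backup compression by default for the whole instance.
EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;

-- Or request compression for a single backup.
BACKUP DATABASE [DBName]
    TO DISK = 'ValidPath\DBName.bak'
    WITH COMPRESSION;
```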
FlashArray has built-in data reduction (DRR) technology that encompasses both compression and deduplication. If native SQL Server backup compression is used when generating native SQL Server backup files to FlashArray, there will be no additional compression from the array. If native SQL Server backup compression is not enabled, FlashArray's data reduction will not only compress but, where applicable, deduplicate all but the changed blocks.
SQL Server databases using encryption or compression still benefit from the data reduction technologies in FlashArray even if the DRR number is a lower ratio.
Native SQL Server Backup Performance
For FlashArray, the current recommended setting for the I/O size when generating a backup or restoring a database using native SQL Server backups is 64 KB. This is configured using the MAXTRANSFERSIZE option of the Transact-SQL BACKUP and RESTORE statements. An example of generating a full database backup is shown below.
BACKUP DATABASE [DBName]
TO DISK = 'ValidPath\DBName.bak'
WITH MAXTRANSFERSIZE = 65536;
Backup and restore performance can be further tuned with other parameters such as BUFFERCOUNT and BLOCKSIZE. Tuning BACKUP and RESTORE statements is unique to each environment. For more information, see the Microsoft Learn topics "BACKUP (Transact-SQL)", "RESTORE Statements for Restoring, Recovering, and Managing Backups (Transact-SQL)", and "RESTORE Statements - Arguments (Transact-SQL)".
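A hedged starting point combining these options (the database name, path, and tuning values are illustrative only and must be validated in your environment):

```sql
-- Illustrative values: 64 KB transfers, 64 buffers, 64 KB backup blocks.
BACKUP DATABASE [DBName]
    TO DISK = 'ValidPath\DBName.bak'
    WITH MAXTRANSFERSIZE = 65536,
         BUFFERCOUNT = 64,
         BLOCKSIZE = 65536;
```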
Always test any settings before using in production as they may need to be adjusted for your environment.
Snapshots and SQL Server
Snapshots generated via FlashArray work with SQL Server. FlashArray can generate two kinds of snapshots: application and crash consistent. The table below shows how each kind of snapshot can be generated with FlashArray.
| | Pure Storage FlashArray Management Extension for SQL Server Management Studio | Pure Storage FlashArray PowerShell SDK | PureStorage.FlashArray.Backup PowerShell Module | Purity CLI | Purity Web GUI |
|---|---|---|---|---|---|
| Application Consistent | Yes* | Yes** | Yes** | No | No |
| Crash Consistent | No | Yes | Yes | Yes | Yes |
* Requires the Volume Shadow Copy Service (VSS) and the VSS Hardware Provider which is installed with the extension
** Requires the use of the Transact-SQL Snapshot Backup feature in SQL Server 2022 or later. For more information, see the Microsoft Learn topic "Create a Transact-SQL snapshot backup".
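At a high level, the Transact-SQL snapshot backup flow looks like the following sketch. The database name and metadata path are placeholders, and the array snapshot itself is taken with FlashArray tooling between the two statements.

```sql
-- 1. Suspend the database for a snapshot backup (SQL Server 2022+).
ALTER DATABASE [DBName] SET SUSPEND_FOR_SNAPSHOT_BACKUP = ON;

-- 2. Take the FlashArray snapshot of the database's Volumes here,
--    using the PowerShell SDK, Purity CLI, or GUI.

-- 3. Record the backup metadata; this also resumes I/O for the database.
BACKUP DATABASE [DBName]
    TO DISK = 'ValidPath\DBName.bkm'
    WITH METADATA_ONLY;
```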
Both snapshot types can be used with SQL Server. Here are the snapshot considerations with FlashArray for SQL Server:
- Crash consistent snapshots can be used in any scenario for database recovery, assuming all database files are on volumes in the protection group that is snapped. The recovery point of a crash-consistent snapshot is what is in the snapshot and, on its own, cannot be rolled forward to a further point in time.
- Snapshots can be used to restore SQL Server to a point in time. However, achieving granular recovery requires:
  - The user database(s) set to the FULL or BULK_LOGGED recovery model
  - Transaction log backups to roll an application consistent snapshot forward
- Array-based snapshots are application consistent in the following scenarios:
- When the Volume Shadow Copy Service (VSS) is used
- Combining crash consistent snapshots with the Transact-SQL snapshot backup feature in SQL Server
- Any application consistent snapshot can be rolled forward to achieve point-in-time recovery with transaction log backups if they are available.
- Snapshots can be generated by the Pure Storage FlashArray Management Extension for SQL Server Management Studio (SSMS), Pure Storage FlashArray PowerShell SDK, or the PureStorage.FlashArray.Backup PowerShell Module.
- When using Always On availability groups (AGs), only take a snapshot of the primary replica (i.e. the instance that has the full read/write database), not on a secondary replica (the one receiving the transactions).
- When using log shipping, similar to AGs, only take a snapshot of the instance with the live database, not the warm standby instance(s).
- If a large portion of the transaction log is active and/or there are a large number of uncommitted transactions, a clone of that database created from a snapshot may take longer to come online due to SQL Server's crash recovery process. To mitigate this condition, consider using ActiveDR.
Transparent Data Encryption
Securing data stored in a database is a core concern for all administrators. SQL Server has some built-in features that enhance security, one of which is Transparent Data Encryption (TDE). If TDE is enabled for one or more databases, it impacts DRR because encrypted data cannot be compressed or deduplicated. If clones of a TDE-enabled database exist on the same FlashArray, there will be data reduction since the clones are physically the same; the initial copy will not be data reduced. This also applies to database backups of TDE-enabled databases taken without backup compression, since those will also data reduce against each other.
If TDE is a requirement, consider enabling row- and/or page-level compression in SQL Server. For more information, see the topic "Data compression" in Microsoft Learn.
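For example (the table name is hypothetical), page compression can be enabled on an existing table with a rebuild:

```sql
-- Rebuild the table with page-level compression so the data is
-- compressed before TDE encrypts it on disk.
ALTER TABLE dbo.Orders REBUILD
    WITH (DATA_COMPRESSION = PAGE);
```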