Running MySQL On FlashArray
Preface
According to DB-Engines (https://db-engines.com/en/ranking) MySQL is the second most popular database. MySQL is highly portable with binaries available for Microsoft Windows™, Apple MacOS™, various Unix flavors and Linux distributions. MySQL database is well suited for deployments on shared storage block devices such as Pure Storage® FlashArray™ where high I/O performance is required and advanced data services provide fast and convenient means of database protection, cloning and recovery - for details see MySQL Database Cloning With FlashArray Copy and Protecting MySQL Database With FlashArray Snapshots articles.
For database administrators, MySQL offers a rich set of configuration parameters to manage and improve many aspects of the application behavior and performance. As mentioned earlier, MySQL is available on multiple operating systems but the majority of the database installations are on Linux. The objective of the test was to determine performance impact of the database parameters changes rather than focus on an absolute performance results. This article provides suggested MySQL configuration values for databases located on FlashArray volumes and are based on the tests executed at Pure Storage laboratories.
Establish general performance reference point by monitoring and collecting performance data.
Test Environment
The test bed was constructed with a single server running Linux CentOS 7.7 and FlashArray //X. The MySQL version 8.0.19 (with InnoDB engine) host was equipped with a 25 Gb/s dual port Ethernet device for NVMe-oF RoCE server to FlashArray connectivity. Each FlashArray //X controller also had a single, dual port 50 Gb/s NVMe-OF RoCE interface - see Figure 1.
Figure 1.
FlashArray also supports other protocols such as Fibre Channel and iSCSI.
For host to storage connectivity see the output of nvme list-subsys and multipath -ll commands shown in Figure 2 and Figure 3.
Figure 2.
Figure 3.
Host ports were connected to both FlashArray controllers resulting in four paths to the disk. See Figure 4.
Figure 4.
Operating System
The recommended operating system FlashArray settings for Windows are available in the Microsoft Platform Guide. Linux recommended settings are listed in the Linux Reference section as well as NVMe RoCE Initiator Setup. Additionally, Linux HugeTLB memory pool was enabled and configured - see https://dev.mysql.com/doc/refman/8.0/en/large-page-support.html article on how to properly enable and utilize Linux large page support.
Data Volumes
Placing data files and logs on different devices and volumes has been a long established practice for relational database management systems. Separating those volumes may be desirable when more granular storage control is required. Multiple volumes allow different snapshot and replication schedules, Quality of Service (QoS) settings and may ease disaster recovery operations. During our testing, separating data and log volumes on FlashArray had not resulted in improved database performance.
File Systems
The XFS file system has become the de-facto standard for the majority of Linux distributions. XFS offers multiple mount options with nobarrier and noatime frequently cited as possible settings to improve the file system and consequently database performance. In our tests, mounting database volumes with nobarrier and noatime options has not delivered measurable performance improvement.
Test Methodology
All the tests were executed using sysbench version 1.0.19. Sysbench is a freely available performance benchmarking tool with MySQL driver already compiled-in. The tool is also scriptable and multiple database workloads are readily available. For this set of tests, a tpcc-like set of scripts from Percona (https://github.com/Percona-Lab/sysbench-tpcc) was configured.
Running the benchmark consists of the following steps:
-
Create Database
-
From mysql prompt, run ‘create database <database_name>;’ command
-
-
Prepare Tables and Data
-
From shell, run ‘sysbench <path_to_tpcc.lua>/tppc.lua --threads=<threads> --tables=10 --scale=100 --db-driver=mysql --mysql-db=<database_name> --mysql-user=<mysql_user> --mysql-password=<password> prepare'
-
For example:
-
/usr/bin/sysbench /usr/share/sysbench/tpcc.lua --threads=48 --tables=10 --scale=100 --db-driver=mysql --mysql-db=sysbench10x100W --mysql-user=admin --mysql-password=’password’ prepare
-
Execute Benchmark
-
From shell, run ‘sysbench <path_to_tpcc.lua>/tppc.lua --threads=<threads> --time=<run_time[s]> --tables=<number_of_tables> --scale=<number_of_warehouses> --db-driver=mysql --mysql-db=<database_name> --mysql-user=<mysql_user> --mysql-password=<password> run > <output_file>'
-
For example:
-
/usr/bin/sysbench /usr/share/sysbench/tpcc.lua --threads=48 --time=300 --tables=10 --scale=100 --db-driver=mysql --mysql-db=sysbench10x100W --mysql-user=admin --mysql-password=’password’ run > ~/results/5min/r3/32GBP10x100WFA48.out
Our test database consisted of 100 warehouses with 10 tpcc datasets. The test set included 11 runs for 8, 16, 24, 32, 48, 64, 96, 128, 256, 512, 1024 threads (users) to better estimate the impact of configuration changes. Each sysbench test ran for 5 minutes on MySQL server. There was a 30 seconds sleep time (pause) between the runs. Sample sysbench output is shown in Appendix A.
MySQL Server Configuration
The size of the sysbench tpcc-like generated database (10 sets with 100 warehouses) was approximately 100 GB. To simulate IO bound database server MySQL buffer pool was set to 32 GB.
By default, MySQL database configuration parameters are in the /etc/my.cnf file - see sample my.cnf in Appendix B.
Default (included with installation) /etc/my.cnf file is NOT suitable for production level MySQL and FlashArray deployments.
Changing some of the MySQL server database parameters may improve performance at the expense of transaction atomicity and data recoverability.
Understand the parameters, their values and potential impact before making any changes.
Use test systems for configuration changes before applying changes to production systems.
The MySQL parameters with potential IO impact and the suggested values are listed in Table 1.
Parameter Name | Description | Default Value | Suggested Value |
---|---|---|---|
innodb_checksum_algorithm | Tablespace page checksum algorithm. | crc32 | crc32 |
innodb_doublewrite | Write data to doublewrite buffer before writing to data file. | On |
Data integrity - set to On Performance - set Off |
innodb_flush_log_at_trx_commit | Controls log flushing and writing. | 1 |
Data integrity - set to 1 Performance - set to 2 |
innodb_flush_neighbors | Same extent buffer pool and other dirty pages flushing. Extent is a group of pages within a tablespace. | 0 (zero) - disabled) | 0 (zero) - disabled |
innodb_flush_method |
Data (buffer pool) and log files (redo, undo) write to disk (flush) method; O_DIRECT is used during flushing IO skipping fsync() after each write operation. |
fsync |
O_DIRECT_NO_FSYNC |
innodb_idle_flush_pct | Buffer pool flushing during idle period as a percentage of innodb_io_capacity. Value of 100 does not limit page flushing when InnoDB is idle. | 100 | 100 |
innodb_io_capacity | Number of available IOPS to background tasks (dirty page flushing, merging data from the change buffer) | 200 | ≥ 10000 |
innodb_io_capacity_max | Number of maximum available IOPS to background tasks. | 100 | ≥ 20000 |
innodb_log_compressed_pages | Redo log page compression (for table compression only) | On | Off |
innodb_log_file_size | Size of the log file. | 50 MB (503331648 bytes) | ≥ 1 GB (see Notes) |
innodb_log_files_in_group | Number of log files in the log group. | 2 | ≥16 (see Notes) |
innodb_use_native_io | Use asynchronous I/O - Linux only. | On | On |
Table 1.
Notes
The size and number of MySQL redo logs are controlled by innodb_log_file_size and innodb_log_files_in_group parameters. The size of the redo log is a product of innodb_log_file_size and innodb_log_files_in_group values which should be less than 512 GB. By default, there are two 48 GB redo log files in a group for a total of 96 GB. The size of redo log files can be as large as the buffer pool, not to exceed 512 GB. Also, configuring large redo log files may reduce the checkpoint frequency. The general recommendation is to have enough redo log space available to handle more than an hour of write activity. On the other hand, configuring large log files will result in increased crash recovery time.
Summary
FlashArray with its ease of use, high performance and rich set of data services becomes an excellent storage choice for MySQL database. Designing and configuring database solutions is frequently a compromise between data integrity, simplicity of management and performance. With FlashArray and MySQL, developing a robust, highly performant environment is now achievable.
Appendix A
sysbench 1.0.19 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 96
Initializing random number generator from current time
Initializing worker threads...
Threads started!
SQL statistics:
queries performed:
read: 51309412
write: 53250464
other: 7918156
total: 112478032
transactions: 3958982 (13194.65 per sec.)
queries: 112478032 (374871.13 per sec.)
ignored errors: 17306 (57.68 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 300.0426s
total number of events: 3958982
Latency (ms):
min: 0.42
avg: 7.27
max: 120.65
95th percentile: 16.12
sum: 28791212.81
Threads fairness:
events (avg/stddev): 41239.3958/143.69
execution time (avg/stddev): 299.9085/0.01
Appendix B
# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/8.0/...-defaults.html
[mysqld]
default-authentication-plugin=mysql_native_password
large-pages
datadir=/r3data/mysql
socket=/var/lib/mysql/mysql.sock
bind-address=0.0.0.0
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
# general
max_connections=4000
table_open_cache=8000
table_open_cache_instances=16
back_log=1500
default_password_lifetime=0
ssl=0
performance_schema=OFF
max_prepared_stmt_count=512000
skip_log_bin=1
character_set_server=latin1
collation_server=latin1_swedish_ci
skip-character-set-client-handshake
transaction_isolation=REPEATABLE-READ
# files
innodb_file_per_table
innodb_log_file_size=1024M
innodb_log_files_in_group=32
innodb_open_files=4000
innodb_log_compressed_pages=off
# buffers
innodb_buffer_pool_size=32G
innodb_buffer_pool_instances=16
innodb_log_buffer_size=32M
# tune
innodb_doublewrite=0
innodb_thread_concurrency=0
innodb_flush_log_at_trx_commit=2
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10
join_buffer_size=32K
sort_buffer_size=32K
innodb_use_native_aio=1
innodb_stats_persistent=1
innodb_spin_wait_delay=6
innodb_max_purge_lag_delay=300000
innodb_max_purge_lag=0
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_checksum_algorithm=none
innodb_io_capacity=10000
innodb_io_capacity_max=30000
innodb_lru_scan_depth=9000
innodb_change_buffering=none
innodb_read_only=0
innodb_page_cleaners=4
innodb_undo_log_truncate=off
# performance special
innodb_adaptive_flushing=1
innodb_flush_neighbors=0
innodb_read_io_threads=16
innodb_write_io_threads=16
innodb_purge_threads=4
innodb_adaptive_hash_index=0
# monitoring
innodb_monitor_enable='%'
#
innodb_log_compressed_pages=off
© 2019 Pure Storage, the Pure P Logo, and the marks on the Pure Trademark List at https://www.purestorage.com/legal/pr...duserinfo.html are trademarks of Pure Storage, Inc. Other names are trademarks of their respective owners.
THIS DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID. PURE STORAGE SHALL NOT BE LIABLE FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES IN CONNECTION WITH THE FURNISHING, PERFORMANCE, OR USE OF THIS DOCUMENTATION. THE INFORMATION CONTAINED IN THIS DOCUMENTATION IS SUBJECT TO CHANGE WITHOUT NOTICE.