8.5.8 Optimizing InnoDB Disk I/O
If you follow best practices for database design and tuning techniques for SQL operations, but your database is still slow due to heavy disk I/O activity, consider these disk I/O optimizations. If the Unix
top tool or the Windows Task Manager shows that the CPU usage percentage with your workload is less than 70%, your workload is probably disk-bound.
Increase buffer pool size
When table data is cached in the
InnoDBbuffer pool, it can be accessed repeatedly by queries without requiring any disk I/O. Specify the size of the buffer pool with the
innodb_buffer_pool_sizeoption. This memory area is important enough that it is typically recommended that
innodb_buffer_pool_sizeis configured to 50 to 75 percent of system memory. For more information see, Section 18.104.22.168, “How MySQL Uses Memory”.
Adjust the flush method
In some versions of GNU/Linux and Unix, flushing files to disk with the Unix
InnoDBuses by default) and similar methods is surprisingly slow. If database write performance is an issue, conduct benchmarks with the
innodb_flush_methodparameter set to
Use a noop or deadline I/O scheduler with native AIO on Linux
InnoDBuses the asynchronous I/O subsystem (native AIO) on Linux to perform read-ahead and write requests for data file pages. This behavior is controlled by the
innodb_use_native_aioconfiguration option, which is enabled by default. With native AIO, the type of I/O scheduler has greater influence on I/O performance. Generally, noop and deadline I/O schedulers are recommended. Conduct benchmarks to determine which I/O scheduler provides the best results for your workload and environment. For more information, see Section 14.8.7, “Using Asynchronous I/O on Linux”.
Use direct I/O on Solaris 10 for x86_64 architecture
When using the
InnoDBstorage engine on Solaris 10 for x86_64 architecture (AMD Opteron), use direct I/O for
InnoDB-related files to avoid degradation of
InnoDBperformance. To use direct I/O for an entire UFS file system used for storing
InnoDB-related files, mount it with the
mount_ufs(1M). (The default on Solaris 10/x86_64 is not to use this option.) To apply direct I/O only to
InnoDBfile operations rather than the whole file system, set
innodb_flush_method = O_DIRECT. With this setting,
fcntl()for I/O to data files (not for I/O to log files).
Use raw storage for data and log files with Solaris 2.6 or later
When using the
InnoDBstorage engine with a large
innodb_buffer_pool_sizevalue on any release of Solaris 2.6 and up and any platform (sparc/x86/x64/amd64), conduct benchmarks with
InnoDBdata files and log files on raw devices or on a separate direct I/O UFS file system, using the
forcedirectiomount option as described previously. (It is necessary to use the mount option rather than setting
innodb_flush_methodif you want direct I/O for the log files.) Users of the Veritas file system VxFS should use the
Do not place other MySQL data files, such as those for
MyISAMtables, on a direct I/O file system. Executables or libraries must not be placed on a direct I/O file system.
Use additional storage devices
Additional storage devices could be used to set up a RAID configuration. For related information, see Section 8.12.2, “Optimizing Disk I/O”.
InnoDBtablespace data files and log files can be placed on different physical disks. For more information, refer to the following sections:
Consider non-rotational storage
Non-rotational storage generally provides better performance for random I/O operations; and rotational storage for sequential I/O operations. When distributing data and log files across rotational and non-rotational storage devices, consider the type of I/O operations that are predominantly performed on each file.
Random I/O-oriented files typically include file-per-table and general tablespace data files, undo tablespace files, and temporary tablespace files. Sequential I/O-oriented files include
InnoDBsystem tablespace files (due to doublewrite buffering and change buffering) and log files such as binary log files and redo log files.
Review settings for the following configuration options when using non-rotational storage:
crc32option uses a faster checksum algorithm and is recommended for fast storage systems.
Optimizes I/O for rotational storage devices. Disable it for non-rotational storage or a mix of rotational and non-rotational storage.
The default setting of 200 is generally sufficient for a lower-end non-rotational storage device. For higher-end, bus-attached devices, consider a higher setting such as 1000.
The default value of 2000 is intended for workloads that use non-rotational storage. For a high-end, bus-attached non-rotational storage device, consider a higher setting such as 2500.
If redo logs are on non-rotational storage, consider disabling this option to reduce logging. See Disable logging of compressed pages.
If redo logs are on non-rotational storage, configure this option to maximize caching and write combining.
Consider using a page size that matches the internal sector size of the disk. Early-generation SSD devices often have a 4KB sector size. Some newer devices have a 16KB sector size. The default
InnoDBpage size is 16KB. Keeping the page size close to the storage device block size minimizes the amount of unchanged data that is rewritten to disk.
If binary logs are on non-rotational storage and all tables have primary keys, consider setting this option to
minimalto reduce logging.
Ensure that TRIM support is enabled for your operating system. It is typically enabled by default.
Increase I/O capacity to avoid backlogs
If throughput drops periodically because of
InnoDBcheckpoint operations, consider increasing the value of the
innodb_io_capacityconfiguration option. Higher values cause more frequent flushing, avoiding the backlog of work that can cause dips in throughput.
Lower I/O capacity if flushing does not fall behind
If the system is not falling behind with
InnoDBflushing operations, consider lowering the value of the
innodb_io_capacityconfiguration option. Typically, you keep this option value as low as practical, but not so low that it causes periodic drops in throughput as mentioned in the preceding bullet. In a typical scenario where you could lower the option value, you might see a combination like this in the output from
SHOW ENGINE INNODB STATUS:
History list length low, below a few thousand.
Insert buffer merges close to rows inserted.
Modified pages in buffer pool consistently well below
innodb_max_dirty_pages_pctof the buffer pool. (Measure at a time when the server is not doing bulk inserts; it is normal during bulk inserts for the modified pages percentage to rise significantly.)
Log sequence number - Last checkpointis at less than 7/8 or ideally less than 6/8 of the total size of the
Store system tablespace files on Fusion-io devices
You can take advantage of a doublewrite buffer-related I/O optimization by storing system tablespace files (“ibdata files”) on Fusion-io devices that support atomic writes. In this case, doublewrite buffering (
innodb_doublewrite) is automatically disabled and Fusion-io atomic writes are used for all data files. This feature is only supported on Fusion-io hardware and is only enabled for Fusion-io NVMFS on Linux. To take full advantage of this feature, an
Because the doublewrite buffer setting is global, doublewrite buffering is also disabled for data files residing on non-Fusion-io hardware.
When using the
InnoDBtable compression feature, images of re-compressed pages are written to the redo log when changes are made to compressed data. This behavior is controlled by
innodb_log_compressed_pages, which is enabled by default to prevent corruption that can occur if a different version of the
zlibcompression algorithm is used during recovery. If you are certain that the
zlibversion will not change, disable
innodb_log_compressed_pagesto reduce redo log generation for workloads that modify compressed data.