8.5.1 Optimizing Storage Layout for InnoDB Tables
Once your data reaches a stable size, or a growing table has increased by tens or some hundreds of megabytes, consider using the
OPTIMIZE TABLEstatement to reorganize the table and compact any wasted space. The reorganized tables require less disk I/O to perform full table scans. This is a straightforward technique that can improve performance when other techniques such as improving index usage or tuning application code are not practical.
OPTIMIZE TABLEcopies the data part of the table and rebuilds the indexes. The benefits come from improved packing of data within indexes, and reduced fragmentation within the tablespaces and on disk. The benefits vary depending on the data in each table. You may find that there are significant gains for some and not for others, or that the gains decrease over time until you next optimize the table. This operation can be slow if the table is large or if the indexes being rebuilt do not fit into the buffer pool. The first run after adding a lot of data to a table is often much slower than later runs.
InnoDB, having a long
PRIMARY KEY(either a single column with a lengthy value, or several columns that form a long composite value) wastes a lot of disk space. The primary key value for a row is duplicated in all the secondary index records that point to the same row. (See Section 18.104.22.168, “Clustered and Secondary Indexes”.) Create an
AUTO_INCREMENTcolumn as the primary key if your primary key is long, or index a prefix of a long
VARCHARcolumn instead of the entire column.
VARCHARdata type instead of
CHARto store variable-length strings or for columns with many
CHAR(column always takes
Ncharacters to store data, even if the string is shorter or its value is
NULL. Smaller tables fit better in the buffer pool and reduce disk I/O.
COMPACTrow format (the default
InnoDBformat) and variable-length character sets, such as
CHAR(columns occupy a variable amount of space, but still at least
For tables that are big, or contain lots of repetitive text or numeric data, consider using
COMPRESSEDrow format. Less disk I/O is required to bring data into the buffer pool, or to perform full table scans. Before making a permanent decision, measure the amount of compression you can achieve by using