8.9.5 The Optimizer Cost Model

To generate execution plans, the optimizer uses a cost model that is based on estimates of the cost of various operations that occur during query execution. The optimizer has a set of compiled-in default cost constants available to it to make decisions regarding execution plans.

The optimizer also has a database of cost estimates to use during execution plan construction. These estimates are stored in the server_cost and engine_cost tables in the mysql system database and are configurable at any time. The intent of these tables is to make it possible to easily adjust the cost estimates that the optimizer uses when it attempts to arrive at query execution plans.

Cost Model General Operation

The configurable optimizer cost model works like this:

  • The server reads the cost model tables into memory at startup and uses the in-memory values at runtime. Any non-NULL cost estimate specified in the tables takes precedence over the corresponding compiled-in default cost constant. Any NULL estimate indicates to the optimizer to use the compiled-in default.

  • At runtime, the server may reread the cost tables. This occurs when a storage engine is dynamically loaded or when a FLUSH OPTIMIZER_COSTS statement is executed.

  • Cost tables enable server administrators to easily adjust cost estimates by changing entries in the tables. It is also easy to revert to a default by setting an entry's cost to NULL. The optimizer uses the in-memory cost values, so changes to the tables should be followed by FLUSH OPTIMIZER_COSTS to take effect.

  • The in-memory cost estimates that are current when a client session begins apply throughout that session until it ends. In particular, if the server rereads the cost tables, any changed estimates apply only to subsequently started sessions. Existing sessions are unaffected.

  • Cost tables are specific to a given server instance. The server does not replicate cost table changes to replicas.

The Cost Model Database

The optimizer cost model database consists of two tables in the mysql system database that contain cost estimate information for operations that occur during query execution:

  • server_cost: Optimizer cost estimates for general server operations

  • engine_cost: Optimizer cost estimates for operations specific to particular storage engines

The server_cost table contains these columns:

  • cost_name

    The name of a cost estimate used in the cost model. The name is not case-sensitive. If the server does not recognize the cost name when it reads this table, it writes a warning to the error log.

  • cost_value

    The cost estimate value. If the value is non-NULL, the server uses it as the cost. Otherwise, it uses the default estimate (the compiled-in value). DBAs can change a cost estimate by updating this column. If the server finds that the cost value is invalid (nonpositive) when it reads this table, it writes a warning to the error log.

    To override a default cost estimate (for an entry that specifies NULL), set the cost to a non-NULL value. To revert to the default, set the value to NULL. Then execute FLUSH OPTIMIZER_COSTS to tell the server to reread the cost tables.

  • last_update

    The time of the last row update.

  • comment

    A descriptive comment associated with the cost estimate. DBAs can use this column to provide information about why a cost estimate row stores a particular value.

The primary key for the server_cost table is the cost_name column, so it is not possible to create multiple entries for any cost estimate.

The server recognizes these cost_name values for the server_cost table:

  • disk_temptable_create_cost (default 40.0), disk_temptable_row_cost (default 1.0)

    The cost estimates for internally created temporary tables stored in a disk-based storage engine (either InnoDB or MyISAM). Increasing these values increases the cost estimate of using internal temporary tables and makes the optimizer prefer query plans with less use of them. For information about such tables, see Section 8.4.4, “Internal Temporary Table Use in MySQL”.

    The larger default values for these disk parameters compared to the default values for the corresponding memory parameters (memory_temptable_create_cost, memory_temptable_row_cost) reflects the greater cost of processing disk-based tables.

  • key_compare_cost (default 0.1)

    The cost of comparing record keys. Increasing this value causes a query plan that compares many keys to become more expensive. For example, a query plan that performs a filesort becomes relatively more expensive compared to a query plan that avoids sorting by using an index.

  • memory_temptable_create_cost (default 2.0), memory_temptable_row_cost (default 0.2)

    The cost estimates for internally created temporary tables stored in the MEMORY storage engine. Increasing these values increases the cost estimate of using internal temporary tables and makes the optimizer prefer query plans with less use of them. For information about such tables, see Section 8.4.4, “Internal Temporary Table Use in MySQL”.

    The smaller default values for these memory parameters compared to the default values for the corresponding disk parameters (disk_temptable_create_cost, disk_temptable_row_cost) reflects the lesser cost of processing memory-based tables.

  • row_evaluate_cost (default 0.2)

    The cost of evaluating record conditions. Increasing this value causes a query plan that examines many rows to become more expensive compared to a query plan that examines fewer rows. For example, a table scan becomes relatively more expensive compared to a range scan that reads fewer rows.

The engine_cost table contains these columns:

  • engine_name

    The name of the storage engine to which this cost estimate applies. The name is not case-sensitive. If the value is default, it applies to all storage engines that have no named entry of their own. If the server does not recognize the engine name when it reads this table, it writes a warning to the error log.

  • device_type

    The device type to which this cost estimate applies. The column is intended for specifying different cost estimates for different storage device types, such as hard disk drives versus solid state drives. Currently, this information is not used and 0 is the only permitted value.

  • cost_name

    Same as in the server_cost table.

  • cost_value

    Same as in the server_cost table.

  • last_update

    Same as in the server_cost table.

  • comment

    Same as in the server_cost table.

The primary key for the engine_cost table is a tuple comprising the (cost_name, engine_name, device_type) columns, so it is not possible to create multiple entries for any combination of values in those columns.

The server recognizes these cost_name values for the engine_cost table:

  • io_block_read_cost (default 1.0)

    The cost of reading an index or data block from disk. Increasing this value causes a query plan that reads many disk blocks to become more expensive compared to a query plan that reads fewer disk blocks. For example, a table scan becomes relatively more expensive compared to a range scan that reads fewer blocks.

  • memory_block_read_cost (default 1.0)

    Similar to io_block_read_cost, but represents the cost of reading an index or data block from an in-memory database buffer.

If the io_block_read_cost and memory_block_read_cost values differ, the execution plan may change between two runs of the same query. Suppose that the cost for memory access is less than the cost for disk access. In that case, at server startup before data has been read into the buffer pool, you may get a different plan than after the query has been run because then the data will be in memory.

Making Changes to the Cost Model Database

For DBAs who wish to change the cost model parameters from their defaults, try doubling or halving the value and measuring the effect.

Changes to the io_block_read_cost and memory_block_read_cost parameters are most likely to yield worthwhile results. These parameter values enable cost models for data access methods to take into account the costs of reading information from different sources; that is, the cost of reading information from disk versus reading information already in a memory buffer. For example, all other things being equal, setting io_block_read_cost to a value larger than memory_block_read_cost causes the optimizer to prefer query plans that read information already held in memory to plans that must read from disk.

This example shows how to change the default value for io_block_read_cost:

UPDATE mysql.engine_cost
  SET cost_value = 2.0
  WHERE cost_name = 'io_block_read_cost';
FLUSH OPTIMIZER_COSTS;

This example shows how to change the value of io_block_read_cost only for the InnoDB storage engine:

INSERT INTO mysql.engine_cost
  VALUES ('InnoDB', 0, 'io_block_read_cost', 3.0,
  CURRENT_TIMESTAMP, 'Using a slower disk for InnoDB');
FLUSH OPTIMIZER_COSTS;