25.19 Using the Performance Schema to Diagnose Problems

The Performance Schema is a tool to help a DBA do performance tuning by taking real measurements instead of wild guesses. This section demonstrates some ways to use the Performance Schema for this purpose. The discussion here relies on the use of event filtering, which is described in Section 25.4.2, “Performance Schema Event Filtering”.

The following example provides one methodology that you can use to analyze a repeatable problem, such as investigating a performance bottleneck. To begin, you should have a repeatable use case where performance is deemed too slow and needs optimization, and you should enable all instrumentation (no pre-filtering at all).

  1. Run the use case.

  2. Using the Performance Schema tables, analyze the root cause of the performance problem. This analysis will rely heavily on post-filtering.

  3. For problem areas that are ruled out, disable the corresponding instruments. For example, if analysis shows that the issue is not related to file I/O in a particular storage engine, disable the file I/O instruments for that engine. Then truncate the history and summary tables to remove previously collected events.

  4. Repeat the process at step 1.

    At each iteration, the Performance Schema output, particularly the events_waits_history_long table, will contain less and less noise caused by nonsignificant instruments, and given that this table has a fixed size, will contain more and more data relevant to the analysis of the problem at hand.

    At each iteration, investigation should lead closer and closer to the root cause of the problem, as the signal/noise ratio will improve, making analysis easier.

  5. Once a root cause of performance bottleneck is identified, take the appropriate corrective action, such as:

    • Tune the server parameters (cache sizes, memory, and so forth).

    • Tune a query by writing it differently,

    • Tune the database schema (tables, indexes, and so forth).

    • Tune the code (this applies to storage engine or server developers only).

  6. Start again at step 1, to see the effects of the changes on performance.

The mutex_instances.LOCKED_BY_THREAD_ID and rwlock_instances.WRITE_LOCKED_BY_THREAD_ID columns are extremely important for investigating performance bottlenecks or deadlocks. This is made possible by Performance Schema instrumentation as follows:

  1. Suppose that thread 1 is stuck waiting for a mutex.

  2. You can determine what the thread is waiting for:

    SELECT * FROM performance_schema.events_waits_current
    WHERE THREAD_ID = thread_1;

    Say the query result identifies that the thread is waiting for mutex A, found in events_waits_current.OBJECT_INSTANCE_BEGIN.

  3. You can determine which thread is holding mutex A:

    SELECT * FROM performance_schema.mutex_instances
    WHERE OBJECT_INSTANCE_BEGIN = mutex_A;

    Say the query result identifies that it is thread 2 holding mutex A, as found in mutex_instances.LOCKED_BY_THREAD_ID.

  4. You can see what thread 2 is doing:

    SELECT * FROM performance_schema.events_waits_current
    WHERE THREAD_ID = thread_2;