Stop using show processlist in a busy MySQL server

Stop using show processlist in a busy MySQL server

MySQL provides 3 ways to list current running threads in the server:

As a DBA you can use any of these methods and they will return about the same details with minor exceptions. But how they get the information is totally different to the extend that they have different impacts on your MySQL server.

Let's go through them one by one to understand how they work and how they impact the server performance:

SHOW PROCESSLIST Statement

It is simply used by executing SHOW PROCESSLIST; or SHOW FULL PROCESSLIST;. The only difference beween the two forms is that the 1st returns only the 1st 100 characters of the query, while the later returns up to max_allowed_packet characters from the running query.

How it works

  1. Lock mutex LOCK_thd_remove; so no any thread can be removed
  2. Lock mutex LOCK_thd_list; so no new thread can be added
  3. Copies the details of all threads
  4. Unlock mutex LOCK_thd_list
  5. Copies the information about each thread that the user has access to into a local array
  6. Unlock mutex LOCK_thd_remove
  7. Send the results to the client

So during the execution of this statement, no thread can be added or removed, which means no new connection can be initialized or terminated 😱. A single execution may not hurt, but continuously executing it can impact the server ability to handle conenctions.

Notes

  • The returned query data with the FULL form depends on the value of variable max_allowed_packet which has a default value of 64MB. So a lot if data can be returned if the application is inserting or updating a big LOB or TEXT.

Table information_schema.processlist

By running select * from information_schema.processlist; you'll get the same data returned by SHOW FULL PROCESSLIST; with one exception, the maximum length of the query returned is 64KB.

This is not a real table, the table data is generated each time the table is selected from.

How it works

  1. Lock mutex LOCK_thd_remove; so no any thread can be removed
  2. Lock mutex LOCK_thd_list; so no new thread can be added
  3. Copies the details of all threads
  4. Unlock mutex LOCK_thd_list
  5. Copies the information about each thread that the user has access to into a temporary table
  6. Unlock mutex LOCK_thd_remove
  7. Run the select statement on the temporary table

This is the same as SHOW PROCESSLIST Statement except that the data is stored 1st in a temporary table, and then the temporary table is queried normally as any other real table.

Here is one catch, the temporary table is not in-memory table, it is an in-desk table. The size of info field is too big to be stored in MEMORY table. This means that the information is 1st stored physically in the desk before the select statement is executed.

So this way of retreiving threads information has the compound impact of:

  • The same impact on new connections as SHOW PROCESSLIST Statement
  • Extra disk IO
  • The impact on closing connections is much increased as they have to wait for the table to be persisted, and this can be long time in case of contention in IO and innodb buffer pool (assuming you have default setting of internal_tmp_disk_storage_engine=INNODB).

Notes

  • The returned data in info column has a maximum length of 64KB.

Table performance_Schema.threads

Table performance_Schema.threads has the same columns as SHOW PROCESSLIST; and information_schema.processlist in addition to few others. But the shared columns have a prefix processlist_ to the column name. So the query can be read for column processlist_info instead of info.

This is a real in-memory table, it is different than the previous two methods in that it has the information always available and waiting for you to query it.

When a new thread is created, a new row is added to the table, and when a thread is terminated a row is deleted from the threads table.

How does it work? you just query it! No IO or global mutex, it is a query for an in-memory table: select * from performance_schema.threads;

There is a very small impact during the server operation by continuously updating this table and others in performance schema, but non is blocking, and in my experience the impact is negligible.

Notes

  • performance_Schema.threads has information about background threads also, so the number of returned rows is more than those returned by the other two methods. You can filter-out background thread by select * from performance_schema.threads where TYPE='FOREGROUND';.
  • Any user with select access to this table can access the information of all threads.
  • Performance schema must be enabled for this table to work.
  • The returned query is truncated by default after 1024 characters. You can change the stored query length by setting system variable performance_schema_max_sql_text_length.

Conclusion

  • Querying table informations_schema.processlist has the largest impact on the DB server peroformance.
  • If not done already, work on changing your monitoring scripts to use performance_schema instead of information_schema and show processlist, this should reduce the impact of your monitoring queries on the server.