学堂 学堂 学堂公众号手机端

ANALYZE TABLE 导致的数据库堵塞分析

lewis 5年前 (2020-03-20) 阅读数 6 #技术
MySQL堵塞导火线

一次在处理慢查询时,对表做了分析,ANALYZE TABLE ,之后不久,应用反应数据库查询很慢,通过

show full processlit;

查看 发现与该表相关的大量查询状态均变成了Waiting for table flush。



堵塞原因

从官网信息可以看到出现 Waiting for table flush的原因

The thread is executing​​FLUSH TABLES​​and is waiting for all threads to close their tables, or the thread got a notification that the underlying structure for a table has changed and it needs to reopen the table to get the new structure. However, to reopen the table, it must wait until all other threads have closed the table in question.

This notification takes place if another thread has used​​FLUSH TABLES​​or one of the following statements on the table in question:​​FLUSH TABLEStbl_name​,​​ALTER TABLE​​,​​RENAME TABLE​​,​​REPAIR TABLE​​,​​ANALYZE TABLE​​, or​​OPTIMIZE TABLE​​.


alter table,rename table,repair table,analyze table,optimize table 等DDL语句 需要关闭table,然后重新打开table,

而这些table可能会存在大的事务在执行,或者被锁住了,从而无法关闭table,所以就出现了状态:Waiting for table flush

也就是说:需要执行 flush tables 的线程,因为某些原因无法关闭表,无法完成flush tables,所以就 waiting for table flush.


验证理论
实验验证
session 1
mysql> select sleep(1000) from test.t1;(表中有数据)

session 2
mysql> analyze table test.t1;

session 3
mysql> select * from test.t1;被阻塞

session 4
mysql> show processlist;
+----+------+------------------+------+------------------+--------+---------------------------------------------------------------+---------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+------------------+------+------------------+--------+---------------------------------------------------------------+---------------------------------+
| 5 | repl | 10.0.16.10:48904 | NULL | Binlog Dump GTID | 932031 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 14 | root | localhost | test | Query | 55 | User sleep | select sleep(1000) from test.t1 |
| 15 | root | localhost | NULL | Sleep | 38 | | NULL |
| 16 | root | localhost | NULL | Query | 16 | Waiting for table flush | select * from test.t1 |
| 17 | root | localhost | NULL | Query | 0 | starting | show processlist |
+----+------+------------------+------+------------------+--------+---------------------------------------------------------------+---------------------------------+
5 rows in set (0.00 sec)
进一步分析

造成此问题原因是当ANALYZE TABLE后,线程得到一个通知,表的底层结构已经改变,它需要重新打开表来获得新的结构。但是,要重新打开表,必须等到所有其他线程都关闭了这个表。

此时的session 1长事务,长时间执行,导致无法关闭t1表,无法完成flush tables,所以就 waiting for table flush.

解决办法

解决此问题只有两个方法,要么杀掉所有的长查询或等查询结束 ,要么在低负载下执行。

版权声明

本文仅代表作者观点,不代表博信信息网立场。

热门