Server CPU at 100%: a troubleshooting log

This morning the server was responding very slowly, so I logged in to the console right away. The CPU turned out to be running at full load.

1. SSH to the server and run top to check the current load:

top

top showed mysql using 375% CPU (the machine has 4 cores, so the ceiling is 400%), which was enough to tentatively pin the problem on MySQL.
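To make the 375% figure concrete: by default top reports per-process CPU on a per-core scale, so a 4-core machine tops out at 400%. The sketch below converts such a figure into a share of the whole machine and takes a non-interactive snapshot of top. The function names `cpu_share` and `snapshot_top` are illustrative, not part of any tool, and the snapshot assumes the procps version of top found on most Linux systems.

```shell
#!/bin/sh
# cpu_share: convert a top-style %CPU figure (per-core scale) into a
# percentage of the whole machine. Hypothetical helper for illustration.
#   $1 = %CPU as shown by top, $2 = number of CPU cores
cpu_share() {
    awk -v pct="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", pct / cores }'
}

# snapshot_top: run a single batch-mode iteration of top and keep only
# the summary area plus the first few processes.
# Assumes procps top (-b = batch mode, -n = number of iterations).
snapshot_top() {
    top -b -n 1 | head -n 15
}
```

For example, `cpu_share 375 4` prints `93.75`, meaning mysql alone was using about 94% of the machine's total CPU capacity.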

2. Find which SQL statements are driving the CPU usage

Run a query to list the statements that have been executing the longest:

SELECT * FROM information_schema.PROCESSLIST 
WHERE command != 'Sleep' 
ORDER BY time DESC LIMIT 20;

The rows at the top are the statements that have been running the longest. They all turned out to involve a few tables with large data volumes and high concurrency.

The fix is simple: add indexes for those queries, and the CPU load gradually comes down after a while.
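As a rough sketch of this step, the helper below only builds a CREATE INDEX statement as text, so it can be reviewed before being sent to the mysql client. The function name `make_index_sql`, the `idx_<table>_<column>` naming scheme, and the `orders` / `user_id` / `mydb` names in the usage note are all assumptions for illustration; this log does not record which tables or columns were actually indexed.

```shell
#!/bin/sh
# make_index_sql: print a single-column CREATE INDEX statement.
# Hypothetical helper; it only generates text and runs nothing.
#   $1 = table name, $2 = column name
make_index_sql() {
    printf 'CREATE INDEX idx_%s_%s ON %s (%s);\n' "$1" "$2" "$1" "$2"
}
```

Running `make_index_sql orders user_id` prints `CREATE INDEX idx_orders_user_id ON orders (user_id);`. Piping that into `mysql mydb` would apply it. Running EXPLAIN on one of the slow statements first is a reasonable way to check whether it is doing a full table scan and which column would benefit from an index.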

3. Kill individual timed-out or hung SQL threads

Once the indexes are in place, most slow statements usually improve. If a few statements still appear hung, they have to be killed manually to free up server resources.

Run the following SQL to generate the kill commands for them:

SELECT concat('kill ', id, ';') FROM information_schema.PROCESSLIST 
WHERE command != 'Sleep' AND time > 200
ORDER BY time DESC;

Copy and run the generated kill commands, and the CPU recovers after a while.
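The same kill list can also be built in the shell when the processlist has been exported as tab-separated text, for instance with the mysql client's batch options `-N -B`. This is a sketch under that assumption: `kill_statements` is a hypothetical helper that expects the thread id in column 1 and the running time in seconds in column 2. It prints the statements for review rather than executing them.

```shell
#!/bin/sh
# kill_statements: read "id<TAB>time" rows on stdin and print a kill
# statement for every thread running longer than $1 seconds.
# Rows are emitted in input order; nothing is executed here.
kill_statements() {
    awk -F '\t' -v limit="$1" '$2 + 0 > limit + 0 { printf "kill %s;\n", $1 }'
}

# Usage (needs a reachable MySQL server and client credentials):
#   mysql -N -B -e "SELECT id, time FROM information_schema.PROCESSLIST \
#       WHERE command != 'Sleep'" | kill_statements 200
```

Keeping the generation and the execution separate leaves room to check the list before anything is killed.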