Troubleshooting High CPU Usage on a Linux Server

2022-05-07

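When CPU usage on a server is too high, narrow the problem down as follows:

1. Use top to find the process with the highest CPU usage.

2. Use ps -ef or jps to identify which background program that process is.

3. Locate the specific thread or code.

4. Convert the ID of the thread you are interested in to hexadecimal (lowercase letters).

5. Run jstack <PID> | grep <hex thread ID, lowercase> -A60 to see what the thread is executing.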


1. Use top to find the process with the highest CPU usage

    top - 09:11:37 up 21 min,  3 users,  load average: 0.54, 0.25, 0.16
    Tasks:  94 total,   1 running,  93 sleeping,   0 stopped,   0 zombie
    %Cpu(s):  3.0 us,  6.4 sy,  0.0 ni, 89.3 id,  0.0 wa,  0.0 hi,  1.3 si,  0.0 st
    KiB Mem :   499428 total,    81452 free,   131984 used,   285992 buff/cache
    KiB Swap:  1572860 total,  1572860 free,        0 used.   325184 avail Mem
      PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
     2485 root      20   0 2024360  25616  12256 S  8.6  5.1   0:08.54 java
     2436 root      20   0  154608   5500   4132 S  1.3  1.1   0:01.50 sshd
      580 root      20   0  376240   9256   6804 S  0.3  1.9   0:00.20 NetworkManager
     1331 root      20   0       0      0      0 S  0.3  0.0   0:00.41 kworker/0:1
        1 root      20   0  128036   6604   4144 S  0.0  1.3   0:01.48 systemd
        2 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kthreadd
        3 root      20   0       0      0      0 S  0.0  0.0   0:00.08 ksoftirqd/0
        5 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kworker/0:0H
        6 root      20   0       0      0      0 S  0.0  0.0   0:00.01 kworker/u2:0
        7 root      rt   0       0      0      0 S  0.0  0.0   0:00.00 migration/0
        8 root      20   0       0      0      0 S  0.0  0.0   0:00.00 rcu_bh
        9 root      20   0       0      0      0 S  0.0  0.0   0:00.53 rcu_sched
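When this step has to run inside a script, the interactive top display is awkward to parse, and a ranked list can be built from ps output instead. The following is only a sketch and is not part of the original walkthrough: the function name top_cpu_processes is hypothetical, and it assumes its input comes from the procps command `ps -eo pid,pcpu,comm`.

```shell
# Hypothetical helper: rank processes by CPU share.
# stdin: output of `ps -eo pid,pcpu,comm` (header line included)
# $1:    number of rows to keep (defaults to 5)
top_cpu_processes() {
    # Drop the header, sort numerically on the %CPU column (highest first),
    # then keep the requested number of rows.
    tail -n +2 | LC_ALL=C sort -k2,2 -rn | head -n "${1:-5}"
}

# Typical use on a live system (not executed here):
#   ps -eo pid,pcpu,comm | top_cpu_processes 3
```

Note that ps reports CPU time averaged over the lifetime of each process, so its figures will not match the per-interval %CPU column that top shows.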

2. Use ps -ef or jps to identify which background program is causing the trouble

    [root@node3 ~]# jps -l
    2516 sun.tools.jps.Jps
    2485 com.wu.pratice.jvm.UnableCreateNewThreadDemo
    1413 -- process information unavailable
    7162 -- process information unavailable
    [root@node3 ~]# ps -ef | grep java
    root      2485  2440  8 09:09 pts/2    00:00:26 java com.wu.pratice.jvm.UnableCreateNewThreadDemo
    root      2527  2495  0 09:14 pts/1    00:00:00 grep --color=auto java

3. Locate the specific thread or code

ps -mp <PID> -o THREAD,tid,time

-m shows all threads of the process
-p specifies the process ID
-o is followed by a user-defined output format

    [root@node3 ~]# ps -mp 2485 -o THREAD,tid,time
    USER     %CPU PRI SCNT WCHAN  USER SYSTEM   TID     TIME
    root      8.5   -    - -         -      -     - 00:00:30
    root      0.0  19    - futex_    -      -  2485 00:00:00
    root      8.3  19    - n_tty_    -      -  2486 00:00:29
    root      0.0  19    - futex_    -      -  2487 00:00:00
    root      0.0  19    - futex_    -      -  2488 00:00:00
    root      0.0  19    - futex_    -      -  2489 00:00:00
    root      0.0  19    - futex_    -      -  2490 00:00:00
    root      0.0  19    - futex_    -      -  2491 00:00:00
    root      0.0  19    - futex_    -      -  2492 00:00:00
    root      0.0  19    - futex_    -      -  2493 00:00:00
    root      0.0  19    - futex_    -      -  2494 00:00:00
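With many threads it is easy to misread the busiest row by eye. As a rough sketch, the hottest thread can be picked out of this output with awk. The function name busiest_tid is hypothetical, and the column positions (%CPU in column 2, TID in column 8) assume exactly the `-o THREAD,tid,time` format used above.

```shell
# Hypothetical helper: print the TID of the thread with the highest %CPU.
# stdin: output of `ps -mp <PID> -o THREAD,tid,time`
busiest_tid() {
    awk '
        BEGIN { max = -1 }
        # Skip the header (NR == 1) and the per-process summary row,
        # whose TID column is "-" rather than a number.
        NR > 1 && $8 ~ /^[0-9]+$/ && $2 + 0 > max { max = $2 + 0; tid = $8 }
        END { if (tid != "") print tid }
    '
}

# Typical use on a live system (not executed here):
#   ps -mp 2485 -o THREAD,tid,time | busiest_tid
```

For the sample output above this selects thread 2486, the one at 8.3% CPU.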

4. Convert the thread ID to hexadecimal (lowercase letters)

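Thread 2486 is the one consuming CPU in the ps output above. In hexadecimal, 2486 is 9b6.

To convert: printf "%x\n" 2486

  Note: the hex digits must be lowercase. jstack prints the native thread ID (nid) in lowercase hex and grep is case-sensitive by default, so searching for an uppercase 9B6 finds nothing.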
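The conversion can be wrapped in a small function so that the result is already in the form jstack prints. This is a minimal sketch; the name tid_to_nid is hypothetical.

```shell
# Hypothetical helper: convert a decimal Linux thread ID (TID) to the
# lowercase hex form that jstack shows in its nid= field.
tid_to_nid() {
    # %x always emits lowercase hex digits; %X would emit uppercase.
    printf '0x%x\n' "$1"
}

tid_to_nid 2486   # prints 0x9b6
```

Including the 0x prefix in the search string also makes the later grep a little more specific than searching for the bare digits.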

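5. jstack <PID> | grep <hex thread ID, lowercase> -A60

-A N also prints the N lines that follow each matching line (60 here), which is normally enough to show the thread's whole stack trace.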

    [root@node3 ~]# jstack 2485 | grep 9b6 -A60
    "main" #1 prio=5 os_prio=0 tid=0x00007f165004b800 nid=0x9b6 runnable [0x00007f16590c3000]
       java.lang.Thread.State: RUNNABLE
            at java.io.FileOutputStream.writeBytes(Native Method)
            at java.io.FileOutputStream.write(FileOutputStream.java:326)
            at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
            at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
            - locked <0x00000000fac20580> (a java.io.BufferedOutputStream)
            at java.io.PrintStream.write(PrintStream.java:482)
            - locked <0x00000000fac18170> (a java.io.PrintStream)
            at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
            at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
            at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)
            - locked <0x00000000fac18128> (a java.io.OutputStreamWriter)
            at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:185)
            at java.io.PrintStream.newLine(PrintStream.java:546)
            - eliminated <0x00000000fac18170> (a java.io.PrintStream)
            at java.io.PrintStream.println(PrintStream.java:807)
            - locked <0x00000000fac18170> (a java.io.PrintStream)
            at com.wu.pratice.jvm.UnableCreateNewThreadDemo.main(UnableCreateNewThreadDemo.java:24)
    "VM Thread" os_prio=0 tid=0x00007f16500cb800 nid=0x9b7 runnable
    "VM Periodic Task Thread" os_prio=0 tid=0x00007f165011a000 nid=0x9be waiting on condition
    JNI global references: 5
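Because -A60 prints a fixed number of lines, the output above runs past the "main" thread into unrelated entries such as "VM Thread". One possible alternative, sketched here, is to let awk print only the entry of the matching thread. It relies on the assumption that jstack separates thread entries with blank lines; the function name stack_for_nid is hypothetical.

```shell
# Hypothetical helper: print only the thread entry whose nid matches.
# stdin: full jstack output
# $1:    thread ID in lowercase hex, without the 0x prefix (e.g. 9b6)
stack_for_nid() {
    # RS = "" switches awk to paragraph mode, so each blank-line-separated
    # thread entry becomes one record. The trailing space in the key stops
    # nid=0x9b6 from also matching a longer ID such as nid=0x9b6f.
    awk -v key="nid=0x$1 " 'BEGIN { RS = "" } index($0, key) { print }'
}

# Typical use on a live system (not executed here):
#   jstack 2485 | stack_for_nid 9b6
```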

Summary:

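1. For Java applications, the following common performance problems can all be investigated starting from the thread stacks:

The system hangs and stops responding

High CPU usage

Long response times

Thread deadlocks, and so on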

2. To tell whether a thread is hard at work or idling, look at its state. The states you will meet most often are RUNNABLE, BLOCKED, WAITING, and TIMED_WAITING.

RUNNABLE

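From the JVM's point of view, RUNNABLE means the thread is running. A running thread normally consumes CPU, but not every RUNNABLE thread does. A thread doing network I/O, for example, is actually suspended while it waits for data. Because the suspension happens in native code, the JVM is unaware of it, so the thread is not reported as WAITING or TIMED_WAITING the way it would be after an explicit call to Java's wait() or sleep(). It uses a little CPU only once data arrives.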

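TIMED_WAITING/WAITING

Both states mean the thread is suspended and waiting to be woken up. If a timeout was set, the state is TIMED_WAITING. If no timeout was set, the state is WAITING, and the thread leaves it only when lock.notify() or lock.notifyAll() is called or the thread is interrupted. Within TIMED_WAITING/WAITING, two further annotations in the thread dump are worth noting:

waiting on condition: the thread is waiting for some other condition to occur that will wake it up;

on object monitor: the thread is executing obj.wait(); it has released the monitor and joined the "Wait Set" queue;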

BLOCKED

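The thread is blocked, usually because it is waiting to enter a critical section (shown as "waiting for monitor entry" in the dump). This state deserves close attention.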
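To see quickly which threads are stuck this way, the thread names can be pulled from the header lines of a dump. This is a sketch only: the function name blocked_threads is hypothetical, and it assumes the jstack header format shown earlier, where the thread name is the first double-quoted string on the line.

```shell
# Hypothetical helper: list the names of threads waiting to enter a monitor.
# stdin: full jstack output
blocked_threads() {
    # Split each line on double quotes, so field 2 is the thread name.
    awk -F'"' '/waiting for monitor entry/ { print $2 }'
}

# Typical use on a live system (not executed here):
#   jstack 2485 | blocked_threads
```

If the same names keep appearing across several dumps taken a few seconds apart, lock contention is a likely suspect.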

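3. Which thread states consume CPU?

Threads in the TIMED_WAITING, WAITING, or BLOCKED state do not consume CPU. For a RUNNABLE thread, whether it consumes CPU depends on what its code is doing:

Pure Java computation that has not been suspended consumes CPU;

Network I/O does not consume CPU while it is waiting for data.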
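A per-state tally of a thread dump gives a quick first impression of where the threads are spending their time. The sketch below counts the java.lang.Thread.State lines in jstack output; the function name thread_state_counts is hypothetical.

```shell
# Hypothetical helper: count threads per state in a thread dump.
# stdin: full jstack output
thread_state_counts() {
    # On a "java.lang.Thread.State: RUNNABLE" line the state is field 2.
    # Qualifiers such as "(on object monitor)" land in later fields.
    awk '/java\.lang\.Thread\.State:/ { count[$2]++ }
         END { for (s in count) print s, count[s] }' | LC_ALL=C sort
}

# Typical use on a live system (not executed here):
#   jstack 2485 | thread_state_counts
```

A large RUNNABLE count is not proof of CPU pressure on its own, since threads waiting on network I/O are also reported as RUNNABLE. The stacks of those threads still have to be read.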

