Calculating CPU utilization with PromQL, the Prometheus query language

————–CPU utilization————–

100 * (1 - sum by (instance)(increase(node_cpu_seconds_total{mode="idle"}[5m])) / sum by (instance)(increase(node_cpu_seconds_total[5m])))

This query shows the CPU utilization of every host in a single panel.

Here are some notes taken from the reference document:

increase(node_cpu_seconds_total{cpu="0",mode="idle"}[5m])

node_cpu_seconds_total is the cumulative CPU time, in seconds, since the system started.

{cpu="0"} selects the first CPU.

{mode="idle"} selects the time that CPU spent in the idle state.

[5m] selects the samples from the last 5 minutes.

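increase() returns how much a counter grew over the selected window.

Suppose the system started at 13:00, the counter read 13000 at 13:45, and it reads 13567 now, at 13:50.

Evaluated at 13:50, [5m] reaches back to 13:45.

increase() is then the growth between 13:45 and 13:50, i.e. 13567 - 13000 = 567.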
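The 13:45 to 13:50 example can be reproduced in a few lines of Python. This is only a sketch of the idea: the function name naive_increase and the sample layout are illustrative assumptions, and real Prometheus additionally extrapolates to the window boundaries and corrects for counter resets, which this version ignores.

```python
from datetime import datetime, timedelta

def naive_increase(samples, now, window):
    """Last-minus-first growth of a counter inside [now - window, now].

    Simplified on purpose: Prometheus' increase() also extrapolates to the
    window edges and handles counter resets.
    """
    start = now - window
    in_window = [value for ts, value in sorted(samples) if start <= ts <= now]
    if len(in_window) < 2:
        return None  # not enough samples to measure growth
    return in_window[-1] - in_window[0]

day = datetime(2024, 1, 1)  # arbitrary date, only the clock times matter
samples = [
    (day.replace(hour=13, minute=45), 13000),  # counter value at 13:45
    (day.replace(hour=13, minute=50), 13567),  # counter value at 13:50 (now)
]
now = day.replace(hour=13, minute=50)
print(naive_increase(samples, now, timedelta(minutes=5)))  # 567
```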

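Share of the last 5 minutes that cpu0 spent idle (both sides are wrapped in sum() so that the denominator adds up every mode; without it, PromQL would match only the idle series against itself and return 1):

sum (increase(node_cpu_seconds_total{cpu="0",mode="idle"}[5m])) / sum (increase(node_cpu_seconds_total{cpu="0"}[5m]))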

The selector first restricts the series to cpu0, then to the last 5 minutes.

For cpu0:

In these 5 minutes, the idle increment is 20.

In these 5 minutes, the total increment (user + sys + idle +…) is 500.

So the idle share for these 5 minutes is 20 / 500 = 4%.
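The same single-CPU arithmetic, as a small Python sketch. Only idle = 20 and total = 500 come from the example; the split of the remaining 480 between user and system is made up for illustration, and idle_share is a hypothetical helper name.

```python
def idle_share(increase_by_mode):
    """Idle increment divided by the increment summed over all modes."""
    total = sum(increase_by_mode.values())
    return increase_by_mode["idle"] / total

# Hypothetical 5-minute increments for cpu0: idle=20 and the total of 500
# come from the text, the user/system split is an assumption.
cpu0 = {"idle": 20, "user": 300, "system": 180}

share = idle_share(cpu0)
print(f"{share:.0%}")  # 4%
```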

A server may have four CPUs, and the calculation above covers only one of them. Dropping the cpu filter covers the whole server:

sum (increase(node_cpu_seconds_total{mode="idle"}[5m])) / sum (increase(node_cpu_seconds_total[5m]))

increase(cpu0 idle [5m])  5-minute increment  20
increase(cpu1 idle [5m])  5-minute increment  30
increase(cpu2 idle [5m])  5-minute increment  40
increase(cpu3 idle [5m])  5-minute increment  70

sum() adds them up: 20 + 30 + 40 + 70 = 160

increase(cpu0 [5m])  5-minute increment  1000
increase(cpu1 [5m])  5-minute increment  1200
increase(cpu2 [5m])  5-minute increment  1300
increase(cpu3 [5m])  5-minute increment  1500

sum() adds them up: 1000 + 1200 + 1300 + 1500 = 5000

Share of the server's total CPU time spent idle in these 5 minutes:

160 / 5000 = 0.032 = 3.2%

CPU utilization is therefore 100 * (1 - 0.032) = 96.8%.
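The four-CPU example can be checked with plain Python. This sketch uses the per-CPU increments from the text; the dictionary layout and variable names are illustrative assumptions.

```python
# 5-minute increments per CPU, taken from the worked example.
idle_increase = {"cpu0": 20, "cpu1": 30, "cpu2": 40, "cpu3": 70}
total_increase = {"cpu0": 1000, "cpu1": 1200, "cpu2": 1300, "cpu3": 1500}

idle_sum = sum(idle_increase.values())    # sum() over the idle series
total_sum = sum(total_increase.values())  # sum() over all modes
idle_ratio = idle_sum / total_sum
utilization = 100 * (1 - idle_ratio)      # same shape as the full query

print(idle_sum, total_sum)
print(round(idle_ratio, 3), round(utilization, 1))
```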

If you want to monitor multiple hosts:

Querying node_cpu_seconds_total without an instance filter returns the series of every host, like the following:

increase(cpu0 instance="localhost:8080" [5m])  5-minute increment  1000  \
increase(cpu1 instance="localhost:8080" [5m])  5-minute increment  1200  |
increase(cpu2 instance="localhost:8080" [5m])  5-minute increment  1300  |  these form one group
increase(cpu3 instance="localhost:8080" [5m])  5-minute increment  1500  /

increase(cpu0 instance="localhost:8081" [5m])  5-minute increment  1000  \
increase(cpu1 instance="localhost:8081" [5m])  5-minute increment  1200  |
increase(cpu2 instance="localhost:8081" [5m])  5-minute increment  1300  |  these form one group
increase(cpu3 instance="localhost:8081" [5m])  5-minute increment  1500  /

To get one sum per host, group by the instance label:

sum by (instance) ()
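What the grouping does can be sketched in Python. The per-host totals below come from the example; the idle values are assumptions (localhost:8080 reuses the earlier 20/30/40/70, localhost:8081 is invented), and sum_by_instance and utilization_by_instance are hypothetical helper names, not part of Prometheus.

```python
from collections import defaultdict

def sum_by_instance(series):
    """Add up values per instance label, like sum by (instance) (...)."""
    totals = defaultdict(float)
    for labels, value in series:
        totals[labels["instance"]] += value
    return dict(totals)

def utilization_by_instance(idle_series, all_series):
    """100 * (1 - idle / total), evaluated separately for each instance."""
    idle = sum_by_instance(idle_series)
    total = sum_by_instance(all_series)
    return {inst: 100 * (1 - idle.get(inst, 0.0) / total[inst]) for inst in total}

hosts = ["localhost:8080", "localhost:8081"]
totals = [1000, 1200, 1300, 1500]  # per-CPU totals from the example
idle_values = {                    # assumed idle increments per CPU
    "localhost:8080": [20, 30, 40, 70],
    "localhost:8081": [250, 250, 250, 250],
}

all_series = [({"instance": h, "cpu": str(i)}, t)
              for h in hosts for i, t in enumerate(totals)]
idle_series = [({"instance": h, "cpu": str(i), "mode": "idle"}, v)
               for h in hosts for i, v in enumerate(idle_values[h])]

for inst, pct in utilization_by_instance(idle_series, all_series).items():
    print(inst, round(pct, 1))
```

Each host gets its own result because the sums never mix series with different instance labels, which is why the full query can show every host in one panel.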