We have a few nodes running inside LXC containers on Proxmox.
They get stuck on "High CPU utilization (100.000000%)" once CPU load hits that level, and the reported value does not drop below 100% until we shut down the LXC container and start it again.
Looking at the load average, the containers don't have much work to do.
How does the NetXMS agent measure CPU load?
16 CPU cores, 32 threads
@proxmox:~$ w
07:39:43 up 15 days, 9:28, 1 user, load average: 0.59, 0.64, 0.72
1 vCPU
@lxc01:~$ w
07:36:43 up 1 day, 12:02, 1 user, load average: 0.64, 0.73, 0.77
1 vCPU
@lxc02:~$ w
07:38:35 up 1 day, 13:18, 2 users, load average: 0.59, 0.64, 0.73
Looks interesting. Does it happen for all LXC containers on Proxmox or only for some of them? What is the Proxmox version, and what systems are running inside the containers?
Latest Proxmox 6.1, up to date with apt.
Containers are Ubuntu 18.04.4, also up to date with apt.
It does not happen with all LXC containers in the cluster.
2 out of 3 containers on the same Proxmox node had this when I reported it.
It happens randomly, but the value seems to get stuck after the containers have been under heavy load.
NetXMS takes CPU load information from /proc/stat. We read this file once per second, calculate the delta for each counter, and divide each delta by the sum of all deltas.
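As a rough illustration of that calculation (a minimal sketch, not the agent's actual code), the function below takes two samples of a cpu line from /proc/stat, computes the per-counter deltas, and reports the non-idle share of their sum. The name cpu_busy_pct is hypothetical, and two choices here are assumptions: only the counters from user through steal are summed, and only the idle counter is treated as "not busy". It prints "n/a" when the sum of deltas is zero or negative, since the division is undefined in that case.

```shell
#!/bin/sh
# cpu_busy_pct PREV CUR  (hypothetical helper, not part of NetXMS)
# PREV and CUR are two samples of the same "cpuN ..." line from /proc/stat.
cpu_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        # Fields 2..9 are user, nice, system, idle, iowait, irq, softirq, steal.
        NR == 1 { for (i = 2; i <= 9; i++) prev[i] = $i; next }
        NR == 2 {
            total = 0
            for (i = 2; i <= 9; i++) { d[i] = $i - prev[i]; total += d[i] }
            # Assumption: if the counters did not advance, report n/a
            # instead of dividing by zero.
            if (total <= 0) { print "n/a"; exit }
            # Field 5 is idle; everything else counts as busy in this sketch.
            printf "%.1f\n", 100 * (total - d[5]) / total
        }'
}

# Example with made-up samples: 100 ticks elapsed, 50 of them idle.
cpu_busy_pct "cpu0 100 0 50 800 50 0 0 0 0 0" \
             "cpu0 130 0 60 850 60 0 0 0 0 0"   # prints 50.0

# Live usage on Linux with a one-second interval:
#   a=$(grep '^cpu0 ' /proc/stat); sleep 1; b=$(grep '^cpu0 ' /proc/stat)
#   cpu_busy_pct "$a" "$b"
```

With this kind of calculation, a reading pinned at 100% means the idle counter is not advancing between samples while other counters are, which is what the raw /proc/stat output from an affected container should help confirm or rule out.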
Please run the command below on a container that has got into that state and share the output:
n=1; while [ $n -le 60 ]; do cat /proc/stat | grep cpu0; sleep 1; n=$((n+1)); done
Then restart the container and collect these stats again so we have some data for comparison.