CPU Utilization is Wrong (2017)
- #System Monitoring
- #Memory I/O
- #CPU Performance
- CPU utilization (%CPU) is misleading: it measures non-idle time, not how much work the processor is actually doing.
- Modern CPUs often stall waiting on memory I/O, so a high %CPU does not necessarily mean the CPU is the bottleneck.
- Performance Monitoring Counters (PMCs) give better insight into CPU performance, for example through IPC (Instructions Per Cycle), a metric derived from them.
- As a rule of thumb, an IPC < 1.0 suggests a workload is likely memory-bound (stalled), while an IPC > 1.0 suggests it is likely instruction-bound.
- Tools like `perf` and `tiptop` can measure IPC and stalled cycles, offering actionable tuning insights.
- Cloud environments may limit access to PMCs, but some providers like AWS EC2 now support them.
- Performance monitoring tools should include IPC or stalled cycles to avoid misleading users about CPU utilization.
- Other factors such as temperature, Turbo Boost, and spin locks also make CPU utilization metrics misleading.
- Renaming %CPU to %CYC (cycles) could better reflect what the metric actually measures.
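
The IPC rule of thumb from the list above can be sketched in a few lines of Python. This is a minimal illustration, not code from the original article: the function names (`ipc`, `stalled_fraction`, `interpret`) are hypothetical, and the cycle, instruction, and stalled-cycle totals are assumed to have been collected separately, for example from the counters that `perf stat` reports. The 1.0 threshold is the heuristic quoted above and is likely to need adjusting per processor and workload.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions per cycle; returns 0.0 when no cycles were counted."""
    if cycles == 0:
        return 0.0
    return instructions / cycles


def stalled_fraction(stalled_cycles: int, cycles: int) -> float:
    """Share of counted cycles during which the CPU was stalled."""
    if cycles == 0:
        return 0.0
    return stalled_cycles / cycles


def interpret(instructions: int, cycles: int, threshold: float = 1.0) -> str:
    """Apply the IPC heuristic: below the threshold suggests memory stalls,
    above it suggests the workload is limited by instruction execution."""
    value = ipc(instructions, cycles)
    if value < threshold:
        return "likely memory-bound"
    if value > threshold:
        return "likely instruction-bound"
    return "inconclusive"


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from any real system.
    cycles, instructions, stalled = 1_000_000, 780_000, 600_000
    print(f"IPC: {ipc(instructions, cycles):.2f}")
    print(f"stalled: {stalled_fraction(stalled, cycles):.0%}")
    print(interpret(instructions, cycles))
```

A monitoring tool could display a value like this next to %CPU, so that a reading of 90% "busy" with a low IPC is recognizably a memory problem rather than a shortage of compute.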