CONCEPT Cited by 1 source
CPU utilization vs saturation¶
Definition¶
Utilisation and saturation are two separate measurements of the same CPU. They can (and frequently do) diverge, and conflating them is one of the most common triage mistakes.
- Utilisation = the fraction of wall-clock time a CPU was
busy servicing work (any non-idle state). On Linux:
us + sy + ni + hi + si + st from /proc/stat. The guest and
gnice fields are already counted inside us and ni, so adding
them again would double-count.
- Saturation = the degree to which more work is demanded
than the CPU can service, surfaced as queue depth or wait
time. On Linux:
vmstat's r column (count of tasks running on CPU + waiting to run) or run queue latency (time tasks spend runnable, in TASK_RUNNING, before being dispatched onto a CPU).
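The two definitions can be read off the same /proc/stat file. The following is a minimal sketch, not a reference implementation: the helper names `parse_proc_stat` and `utilisation` are hypothetical, the field order follows proc(5), and the sample strings are made-up values used only to exercise the arithmetic. It assumes guest and guest_nice are already folded into user and nice by the kernel, so they are left out of the busy sum. It also assumes the procs_running line is the source of vmstat's r column.

```python
# Field order of the aggregate "cpu" line in /proc/stat, per proc(5).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")


def parse_proc_stat(text):
    """Return (tick counters, procs_running) from the text of /proc/stat."""
    ticks, procs_running = {}, None
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "cpu":  # aggregate line only, not cpu0, cpu1, ...
            ticks = dict(zip(FIELDS, map(int, parts[1:])))
        elif parts[0] == "procs_running":
            procs_running = int(parts[1])
    return ticks, procs_running


def utilisation(before, after):
    """Busy fraction between two samples; idle and iowait count as not busy."""
    # Assumption: guest and guest_nice are omitted because they are
    # already included in user and nice.
    busy_fields = ("user", "nice", "system", "irq", "softirq", "steal")
    idle_fields = ("idle", "iowait")
    busy = sum(after.get(f, 0) - before.get(f, 0) for f in busy_fields)
    idle = sum(after.get(f, 0) - before.get(f, 0) for f in idle_fields)
    total = busy + idle
    return busy / total if total else 0.0


# Made-up samples, one interval apart (illustrative values only).
SAMPLE_1 = "cpu  100 0 50 800 50 0 0 0 0 0\nprocs_running 3\n"
SAMPLE_2 = "cpu  180 0 60 810 50 0 0 0 0 0\nprocs_running 5\n"

before, _ = parse_proc_stat(SAMPLE_1)
after, runnable = parse_proc_stat(SAMPLE_2)
util = utilisation(before, after)
print(f"utilisation={util:.0%} runnable={runnable}")
```

On a live host the two samples would come from reading /proc/stat twice with a short sleep in between. Utilisation is a rate over that interval, while the runnable count is an instantaneous snapshot.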
Why both matter¶
Four corner cases make the distinction load-bearing:
- High utilisation, low saturation. CPU 99% busy,
r ≤ CPU count. Work arrives at the rate the CPU can service it: high throughput, stable latency. Usually fine.
- High utilisation, high saturation. CPU 99% busy,
r ≫ CPU count. Work arrives faster than it can be serviced; queues grow; tail latency blows up. The classic CPU-bottleneck shape.
- Low utilisation, high saturation. Uncommon but diagnostic.
CPU is idle because tasks are held up by something else
(locks, I/O, cgroup CFS throttling): they have work to do but
are not running. Pair this with
%iowait or cgroup throttling counters.
- Low utilisation, low saturation. Healthy idle or under-subscribed host.
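The four cases above can be sketched as a small classifier. This is illustrative only: the function name `classify` and the 90% busy threshold are assumptions, not values from the source, and saturation uses the "r greater than CPU count" rule quoted later on this page. The low-utilisation, high-saturation case usually needs extra signals (iowait, cgroup throttling counters) that this sketch does not model.

```python
def classify(busy_pct, r, ncpu, busy_threshold=90.0):
    """Place one sample in the utilisation/saturation quadrant.

    busy_pct       -- percentage of CPU time spent non-idle (0-100)
    r              -- vmstat's r column for the same sample
    ncpu           -- number of CPUs on the host
    busy_threshold -- hypothetical cut-off for "high" utilisation
    """
    util = "high" if busy_pct >= busy_threshold else "low"
    sat = "high" if r > ncpu else "low"
    return f"{util} utilisation, {sat} saturation"


# Illustrative inputs for a 32-CPU host.
for busy_pct, r in [(99, 30), (99, 120), (20, 60), (5, 1)]:
    print(busy_pct, r, "->", classify(busy_pct, r, ncpu=32))
```

Keeping the two inputs separate is the point of the sketch: neither number alone decides which quadrant a host is in.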
Netflix's framing¶
From Brendan Gregg's 60-second checklist:
r : Number of processes running on CPU and waiting for a turn. This provides a better signal than load averages for determining CPU saturation, as it does not include I/O. To interpret: an "r" value greater than the CPU count is saturation.
And on utilisation:
The CPU time breakdowns will confirm if the CPUs are busy, by adding user + system time.
Two different measurements on the same vmstat line.
Example: the "99% CPU, queued" shape¶
The post's worked example:
r b swpd free buff cache si so ... us sy id wa st
34 0 0 200889792 73708 591828 0 0 ... 96 1 3 0 0
32 0 0 200889920 73708 591860 0 0 ... 98 1 1 0 0
r = 32-34 on a 32-CPU host, us ≈ 96-98, sy ≈ 1. CPU is
near-100% utilised and saturated. This is not a single
over-busy CPU; it is a run queue that stays at or above the
CPU count across samples. mpstat confirms no single CPU is
hotter than the others.
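The two sample rows can be checked mechanically. The sketch below parses them exactly as shown, keeping the elided "..." columns elided, then applies the two readings from the quoted checklist: user + system time for utilisation, and r greater than the CPU count for saturation. The names `HEADER`, `ROWS`, `NCPU` and `parse_row` are illustrative, not part of any tool.

```python
HEADER = "r b swpd free buff cache si so ... us sy id wa st"
ROWS = [
    "34 0 0 200889792 73708 591828 0 0 ... 96 1 3 0 0",
    "32 0 0 200889920 73708 591860 0 0 ... 98 1 1 0 0",
]
NCPU = 32  # CPU count of the host in the example


def parse_row(row):
    """Map column names to integers, skipping the elided '...' columns."""
    pairs = zip(HEADER.split(), row.split())
    return {name: int(value) for name, value in pairs if value != "..."}


results = []
for row in ROWS:
    rec = parse_row(row)
    busy = rec["us"] + rec["sy"]   # utilisation: user + system time
    saturated = rec["r"] > NCPU    # saturation: r greater than CPU count
    results.append((rec["r"], busy, saturated))
    print(f"r={rec['r']} busy={busy}% saturated={saturated}")
```

Under the strict "greater than" rule the first row counts as saturated and the second sits exactly at the CPU count, which is why the trend across samples matters more than any single row.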
When saturation is the better signal¶
Load average can be ambiguous (mixes CPU and I/O).
concepts/run-queue-latency via eBPF gives the cleanest
scheduler-layer view. vmstat's r is a middle ground — more
specific than load average, less deep than run-queue latency —
and available on every stock Linux host without extra tooling.
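For comparison, load average and an instantaneous runnable count sit side by side in /proc/loadavg. The sketch below parses that file's documented layout: three load averages, a runnable/total pair of scheduling entities, and the most recent PID. The function name `parse_loadavg` and the sample string are made up for illustration. On Linux the load averages also include tasks in uninterruptible sleep, which is the I/O mixing described above, whereas the runnable figure counts only entities that are runnable at that instant.

```python
def parse_loadavg(text):
    """Split the contents of /proc/loadavg into named fields."""
    load1, load5, load15, entities, last_pid = text.split()
    runnable, total = entities.split("/")
    return {
        "load1": float(load1),
        "load5": float(load5),
        "load15": float(load15),
        "runnable": int(runnable),  # currently runnable scheduling entities
        "total": int(total),        # all scheduling entities on the system
        "last_pid": int(last_pid),
    }


# Made-up sample contents; on a live host, read /proc/loadavg instead.
sample = "33.50 30.10 25.00 34/512 12345"
info = parse_loadavg(sample)
print(info["load1"], info["runnable"])
```

A load average well above the runnable count hints that much of the load is blocked on I/O rather than waiting for CPU.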
Seen in¶
- sources/2025-07-29-netflix-linux-performance-analysis-in-60-seconds
— canonical statement of the distinction on the wiki, via the
vmstat 1 r column explanation. Paired with us / sy / id / wa / st as the utilisation measurement on the same output line.
Related¶
- concepts/use-method — the framework that makes the distinction load-bearing.
- concepts/load-average — a poorer demand signal because it mixes CPU + I/O.
- concepts/run-queue-latency — the deeper saturation primitive.
- concepts/cpu-time-breakdown — the utilisation decomposition.
- systems/vmstat · systems/mpstat