
CONCEPT Cited by 1 source

CPU time breakdown

Definition

Linux accounts CPU time across a fixed set of named buckets in /proc/stat, surfaced by vmstat / mpstat / top / sar as columns (us / sy / id / wa / st / hi / si / ni / guest / gnice); each tool shows its own subset. Decomposing total CPU time into these buckets is the primary way to answer the question "the CPU is busy, but doing what?"

Column          Meaning
us / %usr       Time in user-space application code
sy / %sys       Time in kernel-space (syscalls, kernel threads)
ni / %nice      User-space time of low-priority (niced) processes
id / %idle      Truly idle (no runnable task, nothing pending)
wa / %iowait    Idle because tasks are blocked on disk I/O
st / %steal     Cycles stolen by the hypervisor (EC2/Xen/KVM)
hi / %irq       Hardware interrupt servicing
si / %soft      Softirq servicing (networking, timers)
guest           Running a guest VM (host-side view)
gnice           Running a niced (low-priority) guest VM
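
The buckets above can be read directly from /proc/stat. The counters are cumulative ticks since boot, so a percentage breakdown comes from the difference between two samples. A minimal Python sketch, assuming the field order documented in proc(5); the helper names (parse_cpu_line, breakdown, read_cpu_line) are illustrative and not part of any existing tool:

```python
import time
from typing import Dict

# Field order of the aggregate "cpu" line in /proc/stat (see proc(5)).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(line: str) -> Dict[str, int]:
    """Map one 'cpu ...' line from /proc/stat to named tick counters."""
    parts = line.split()
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Older kernels expose fewer fields; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def breakdown(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, float]:
    """Percentage of elapsed CPU time per bucket between two samples."""
    delta = {k: after[k] - before[k] for k in FIELDS}
    # guest and guest_nice are already counted inside user and nice,
    # so they are left out of the total to avoid double counting.
    total = sum(v for k, v in delta.items() if k not in ("guest", "guest_nice"))
    if total <= 0:
        return {k: 0.0 for k in FIELDS}
    return {k: 100.0 * v / total for k, v in delta.items()}


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Return the aggregate 'cpu' line (all CPUs summed)."""
    with open(path) as f:
        for line in f:
            if line.startswith("cpu "):
                return line
    raise RuntimeError("no aggregate cpu line found")


if __name__ == "__main__":
    first = parse_cpu_line(read_cpu_line())
    time.sleep(1)  # sampling interval, comparable to `vmstat 1`
    second = parse_cpu_line(read_cpu_line())
    for name, pct in breakdown(first, second).items():
        print(f"{name:>10}: {pct:5.1f}%")
```

The parsing and the arithmetic are kept separate from the file read so the breakdown can be checked against captured /proc/stat text on any machine.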

The five most useful columns in production

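us + sy — total CPU busy, to a first approximation (tools such as mpstat and top report ni, hi, and si separately, and those also count as busy time). If this is near 100%, the CPU is fully utilised, but not necessarily saturated.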

us — application-layer work. High us with low sy on a non-idle host is the "application is doing its job" shape; the question pivots to whether the application should be doing that much work or whether it's looping / over-allocating / hot-spinning.

sy — kernel-layer work. Rule of thumb from Netflix: "A high system time average, over 20%, can be interesting to explore further: perhaps the kernel is processing the I/O inefficiently." Common causes: mmap access with small read-ahead, heavy syscall churn, naïve JSON/parse loops hitting the slab allocator, a misconfigured I/O scheduler.

wa / %iowait — see concepts/io-wait. This is really a form of idle: the CPU isn't busy, but it is "idle because disk is slow," which points to a disk follow-up with iostat.

st / %steal — time the hypervisor scheduled another guest (or the host's own driver domain under Xen) on your vCPU. Non-zero %steal is the in-guest signature of co-tenancy on a shared physical host; EC2, Azure, and other virtualised platforms will show this when the VM can't get its fair share. A sustained %steal > 0 on a performance-sensitive instance is a signal to consider dedicated tenancy, a larger instance size, or a capacity investigation with the cloud provider.

Reading the breakdown as a shape

Common production shapes:

  • us ≈ 98, sy ≈ 1, wa ≈ 0, st ≈ 0 — application-CPU-bound. Look at pidstat / flame graphs.
  • us ≈ 40, sy ≈ 35, wa ≈ 0, st ≈ 0 — kernel-heavy. Look at syscall rate, softirq counters, possibly NUMA misalignment or inefficient I/O.
  • us ≈ 20, sy ≈ 5, wa ≈ 60, st ≈ 0 — disk-bottlenecked. Pivot to iostat -xz 1.
  • us ≈ 50, sy ≈ 5, wa ≈ 0, st ≈ 30 — hypervisor-stolen cycles. Cloud-side investigation or instance-family upgrade.
  • us ≈ 10, sy ≈ 5, id ≈ 80, wa ≈ 0 — idle or under-subscribed.
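
The shapes above can be turned into a first-pass triage rule. A rough Python sketch: only the 20% sy cut-off comes from the Netflix guidance quoted earlier; the function name classify_shape and every other threshold are assumptions chosen for illustration and should be tuned per workload.

```python
def classify_shape(us: float, sy: float, wa: float, st: float, idle: float) -> str:
    """Map a CPU time breakdown (percentages) to one of the shapes listed above.

    Thresholds are illustrative assumptions, except sy >= 20, which follows
    the "over 20%" system-time guidance quoted in this page.
    """
    if st >= 10:       # assumed cut-off for noticeable steal
        return "hypervisor-stolen"
    if wa >= 30:       # assumed cut-off for an I/O-dominated host
        return "disk-bottlenecked"
    if sy >= 20:       # Netflix rule of thumb for high system time
        return "kernel-heavy"
    if us >= 80:       # assumed cut-off for application-CPU-bound
        return "application-CPU-bound"
    if idle >= 70:     # assumed cut-off for a mostly idle host
        return "idle"
    return "mixed"


if __name__ == "__main__":
    print(classify_shape(us=20, sy=5, wa=60, st=0, idle=15))
```

Steal and iowait are checked first because they describe time the guest could not use at all, which changes the follow-up (cloud-side or iostat) regardless of how the remaining time splits between us and sy.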

Where it's shown

  • vmstat 1 — single system-wide line per interval.
  • mpstat -P ALL 1 — per-CPU breakdown; exposes single-hot-CPU patterns that system-wide averages hide.
  • top — per-process and system-wide.
  • sar — historical data (reads sadc-collected files, in /var/log/sa/ on RHEL-family systems or /var/log/sysstat/ on Debian/Ubuntu).
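
The single-hot-CPU pattern that mpstat -P ALL exposes can also be seen in the per-CPU lines (cpu0, cpu1, ...) of /proc/stat. A hedged Python sketch that compares two snapshots of the file's text; the function names and the 90% threshold are assumptions for illustration, and iowait is counted as idle, in line with the reading of wa given earlier:

```python
from typing import Dict, List


def per_cpu_ticks(stat_text: str) -> Dict[str, List[int]]:
    """Extract user..steal tick counters for each cpuN line of /proc/stat."""
    cpus: Dict[str, List[int]] = {}
    for line in stat_text.splitlines():
        parts = line.split()
        # Per-CPU lines are cpu0, cpu1, ...; the bare "cpu" line is the aggregate.
        if parts and parts[0].startswith("cpu") and parts[0][3:].isdigit():
            ticks = [int(v) for v in parts[1:9]]   # user .. steal
            ticks += [0] * (8 - len(ticks))        # pad fields older kernels lack
            cpus[parts[0]] = ticks
    return cpus


def busy_percent(before: Dict[str, List[int]],
                 after: Dict[str, List[int]]) -> Dict[str, float]:
    """Non-idle share of each CPU between two snapshots (iowait counts as idle)."""
    result: Dict[str, float] = {}
    for name, new in after.items():
        old = before.get(name)
        if old is None:
            continue  # CPU came online between the snapshots
        delta = [n - o for n, o in zip(new, old)]
        total = sum(delta)
        idle = delta[3] + delta[4]  # idle + iowait
        result[name] = 100.0 * (total - idle) / total if total > 0 else 0.0
    return result


def hot_cpus(before_text: str, after_text: str, threshold: float = 90.0) -> List[str]:
    """Names of CPUs whose busy share meets the (assumed) threshold."""
    busy = busy_percent(per_cpu_ticks(before_text), per_cpu_ticks(after_text))
    hot = [name for name, pct in busy.items() if pct >= threshold]
    return sorted(hot, key=lambda name: int(name[3:]))
```

Calling hot_cpus with two reads of /proc/stat taken about a second apart gives roughly the per-CPU view of one mpstat interval; a system-wide average near 50% with one CPU in the returned list is the case the aggregate line hides.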
