CONCEPT Cited by 1 source

Per-core CPU visibility

Per-core CPU visibility is the discipline of watching CPU utilisation core-by-core rather than as a whole-machine aggregate. On hosts with many vCPUs, a single saturated core can cause production-impacting CPU starvation — especially of latency-sensitive kernel threads like network-driver NAPI — while the whole-machine number remains comfortably under-utilised.

The aggregation-bias trap

On a 96-vCPU GPU host, one core pinned at 100% represents only ~1% of whole-machine CPU (1/96 ≈ 1.04%). Any dashboard averaging across cores will show "machine is idle" even as a critical kernel thread is being starved. (Source: sources/2026-04-15-pinterest-finding-zombies-in-our-systems-cpu-bottlenecks)
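The arithmetic behind that ~1% figure can be checked with a small sketch. The helper name aggregate_busy is hypothetical, and the model is deliberately simplified: it assumes the saturated cores sit at exactly 100% and every other core is fully idle.

```shell
# Hypothetical helper: whole-machine busy % when `hot` of `n` cores are
# pinned at 100% and the rest are assumed completely idle.
aggregate_busy() {
  # $1 = total cores, $2 = saturated cores
  awk -v n="$1" -v hot="$2" 'BEGIN { printf "%.2f\n", 100 * hot / n }'
}

aggregate_busy 96 1   # the 96-vCPU, one-hot-core case from the text
```

Under those assumptions the aggregate stays just above 1%, which is why an averaged dashboard cannot distinguish this host from an idle one.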

Pinterest's 2025 ENA-reset incident investigation stalled for weeks at the aggregate-perf stage because "an overall perf view told us very little about what was happening in each individual core." Breaking the view out per core, with mpstat -P ALL 1 giving a per-second, per-core %usr / %sys / %iowait breakdown, immediately surfaced core 39 at 100% %sys for multiple seconds, correlated with the ENA resets, while the rest of the machine stayed quiet.

Canonical triage command

# Per-core utilisation, 1-second cadence, all cores
mpstat -P ALL 1

# Tabular history for offline analysis (Pinterest: 1 hour, 1-second)
mpstat -P ALL 1 3600 > mpstat.log

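Columns to scan: %usr, %sys, %iowait. A single core at 100% %sys points at kernel-side consumption: zombie memcg iteration, lock contention, or softirq floods (mpstat reports softirq time itself in a separate %soft column, so check that too). A single core at 100% %usr points at a userspace workload.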
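Scanning an hour of per-second output by eye is tedious, so a filter over the saved log can help. The following is a minimal sketch, not Pinterest's tooling: the function name hot_cores and the default threshold of 90 are assumptions. It also assumes the log was captured in a locale that prints a decimal point (for example with LC_ALL=C). Column positions are read from each mpstat header row rather than hard-coded, because the timestamp occupies one or two fields depending on locale.

```shell
# Hypothetical helper: print every per-core sample whose %sys is at or
# above a threshold. Usage: hot_cores mpstat.log 90
hot_cores() {
  # $1 = mpstat log file, $2 = %sys threshold (default 90, an assumption)
  awk -v thr="${2:-90}" '
    # Header row: remember which fields hold the CPU id and %sys.
    /%sys/ {
      for (i = 1; i <= NF; i++) {
        if ($i == "CPU")  cpu_col = i
        if ($i == "%sys") sys_col = i
      }
      next
    }
    # Data row: skip the "all" aggregate and the trailing Average block.
    cpu_col && NF >= sys_col && $cpu_col != "all" && $1 != "Average:" \
        && $sys_col + 0 >= thr {
      ts = $1
      for (i = 2; i < cpu_col; i++) ts = ts " " $i   # keep AM/PM if present
      print ts, "cpu=" $cpu_col, "sys=" $sys_col
    }
  ' "$1"
}
```

Each output line carries a timestamp, a core id, and the %sys value, which is enough to line the spikes up against driver-reset timestamps from other logs.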

When to reach for it

  • Latency-sensitive kernel thread starvation symptoms — network driver resets (concepts/network-driver-reset), packet drops, missed timer callbacks.
  • Noisy-neighbor hypotheses on shared multi-tenant hosts where one workload is degrading another.
  • Profile triangulation before committing to an expensive temporal-profiling run — per-core visibility tells you which core to profile.

Complement to temporal profiling

Per-core visibility and temporal profiling form a two-step investigation pattern:

  1. Per-core tells you which core has the spike and approximately when.
  2. Temporal profiling (continuous perf record + Flamescope) tells you what stack is running on that core at that moment.

Pinterest applied them in exactly that order — mpstat revealed core 39 saturated at ENA-reset time, then the continuous-perf-record setup caught the kubelet / mem_cgroup_nr_lru_pages stack at the same timestamp.
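Once per-core data names the suspect core, the second step can be narrowed to that core alone. The sketch below is a one-shot capture, not the continuous perf-record setup Pinterest describes. The wrapper name profile_core, the DRY_RUN switch, the output file name, and the 49 Hz sampling rate are all assumptions chosen for illustration. It relies only on standard perf record options: -F (sample frequency), -C (restrict to a CPU), -g (call graphs), and -o (output file).

```shell
# Hypothetical wrapper: sample call stacks on a single core for a fixed
# window. Set DRY_RUN=1 to print the perf command instead of running it.
profile_core() {
  # $1 = core id, $2 = seconds to sample (default 30, an assumption)
  core="$1"
  secs="${2:-30}"
  set -- perf record -F 49 -C "$core" -g -o "perf-core${core}.data" -- sleep "$secs"
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Example (typically needs root or relaxed perf_event_paranoid):
#   profile_core 39 30
#   perf script --header -i perf-core39.data > stacks.core39.txt
```

The perf script --header text output is the form Flamescope reads, so the resulting file can be inspected at the timestamps where the core was saturated.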

Seen in
