PATTERN Cited by 1 source
Continuous perf-record for time-travel
Loop perf record in fixed-size timestamped windows for hours or
days, so that when a rare event fires you can load just the window
that captured it. This is the ad-hoc bash implementation of
temporal profiling, for when you don't
have a continuous-profiling platform rolled out.
Canonical recipe
The pattern that cracked Pinterest's 3-month ENA-reset investigation (Source: sources/2026-04-15-pinterest-finding-zombies-in-our-systems-cpu-bottlenecks):
# Loop: 360 iterations * 2 minutes ≈ 12 hours coverage
# Timestamped filenames enable post-hoc event alignment
for i in {1..360}; do
  sudo perf record \
    -F 97 \
    -g \
    -a \
    -o "perf-$(hostname)-$(date +"%Y%m%d-%H-%M-%S")-120s.data" \
    -- sleep 120
done
# Post-hoc: generate stack text for each window
# Glob only the .data files so a rerun does not pick up earlier .stacks output
for datafile in perf-*.data; do
  perf script --header -i "$datafile" > "$datafile.stacks"
done
Then load the specific window's .stacks file into
Flamescope (or stackcollapse-perf.pl |
flamegraph.pl for a plain flamegraph) and zoom to the seconds
around the event.
Knob-by-knob rationale
- -F 97 (97 Hz sampling). Prime number near 100 Hz to avoid aliasing with periodic workloads. Low enough to keep overhead tolerable on multi-vCPU hosts; high enough to catch sub-second spikes.
- -g (call-graph). Without stack traces, flamegraphs are useless. Required.
- -a (all CPUs). You don't know in advance which core will have the spike; profile the whole machine.
- 2-minute windows. Chosen to cap individual perf.data file sizes: a 12-hour single file would be unwieldy and would also force you to reprocess all of it for each replay. 2 minutes is small enough to replay quickly, large enough that a single event lands comfortably inside one window.
- Hostname + timestamp in filename. Required for fleet-wide deployment and for correlation with kernel-log timestamps.
- 12-hour run. Tuned to Pinterest's typical 8-12 h training job length — you want the incubation window for the rare event to fit inside the profile horizon.
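The loop count, window length and sampling rate are tied together by simple arithmetic, which is worth redoing if you change any of them. A minimal sketch, assuming two hypothetical helper names (iterations_for, max_samples_per_window) that are not part of the original recipe; the sample count is only an upper bound, since idle CPUs may produce fewer samples:

```bash
# Hypothetical helpers (not from the source recipe) for sizing the loop.

# iterations_for HORIZON_HOURS WINDOW_SECONDS
# Prints how many back-to-back windows cover the horizon (rounded up).
iterations_for() {
  horizon_s=$(( $1 * 3600 ))
  echo $(( (horizon_s + $2 - 1) / $2 ))
}

# max_samples_per_window FREQ_HZ NUM_CPUS WINDOW_SECONDS
# Upper bound on samples in one window when profiling all CPUs (-a).
max_samples_per_window() {
  echo $(( $1 * $2 * $3 ))
}

iterations_for 12 120              # prints 360
max_samples_per_window 97 96 120   # prints 1117440
```

Real coverage runs slightly longer than the nominal horizon, because each iteration also pays perf's start-up and write-out time on top of the 120 s sleep.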
Matching events to windows
The time-travel step:
- Find the event timestamp. Pinterest got it from dmesg ENA reset lines; other examples would be alert firing time, OOM kill time, TCP-reset log time.
- Pick the window. Filename timestamp gives you the start; the event time tells you how many seconds in. Pinterest saw the reset at "about 70 seconds into this profile" and zoomed Flamescope to a 5 s sub-window around it.
- Load and zoom. Flamescope's heatmap UI + drag-to-select is purpose-built for this.
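The "pick the window" step can be scripted once you have the event's wall-clock time. The sketch below is an illustration, not something from the source: the helper names (window_start, offset_in_window, find_window), the WINDOW_S variable and the host name web01 are all hypothetical. It assumes GNU date (for -d), file names in the recipe's perf-HOST-YYYYmmdd-HH-MM-SS-120s.data shape, and an event time expressed in the same time zone the recording host used.

```bash
# Hypothetical helpers; require GNU date (-d) and the recipe's file naming.

# window_start FILE -> "YYYY-mm-dd HH:MM:SS" parsed from the file name,
# or nothing if the name does not match the expected pattern.
window_start() {
  basename "$1" | sed -nE \
    's/.*-([0-9]{4})([0-9]{2})([0-9]{2})-([0-9]{2})-([0-9]{2})-([0-9]{2})-[0-9]+s\.data$/\1-\2-\3 \4:\5:\6/p'
}

# offset_in_window FILE EVENT_TIME -> seconds from window start to the event.
offset_in_window() {
  ts=$(window_start "$1")
  [ -n "$ts" ] || return 1            # unparseable name: refuse to guess
  start_s=$(date -d "$ts" +%s) || return 1
  event_s=$(date -d "$2" +%s) || return 1
  echo $(( event_s - start_s ))
}

# find_window EVENT_TIME FILE... -> prints "FILE OFFSET" for the first window
# whose range [0, WINDOW_S) contains the event (WINDOW_S defaults to 120).
find_window() {
  event=$1; shift
  for f in "$@"; do
    off=$(offset_in_window "$f" "$event") || continue
    if [ "$off" -ge 0 ] && [ "$off" -lt "${WINDOW_S:-120}" ]; then
      echo "$f $off"
      return 0
    fi
  done
  return 1
}

# Example call (hypothetical event time):
#   find_window '2026-04-15 03:11:10' perf-*.data
```

The printed offset is the "seconds into this profile" figure to zoom to in Flamescope. An event that lands in the short gap between two windows matches no file, and the function returns non-zero.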
Storage footprint warning
Each 2-minute perf record -F 97 -g -a data file on a 96-vCPU host
is typically a few hundred MB. 360 files per 12-hour run × multiple hosts
fills a disk fast. This pattern is viable on a reserved debug
fleet (patterns/reserved-host-repro-env) with a dedicated
volume for profile data, not fleet-wide.
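To put a number on the budget before starting a run, the multiplication can be sketched as below. The helper name footprint_gib is hypothetical, and 300 MB per window is an assumed stand-in for "a few hundred MB", not a measured figure; the .stacks text files generated afterwards add to this total.

```bash
# Hypothetical helper: rough disk budget for one recording run.
# footprint_gib MB_PER_WINDOW WINDOWS HOSTS -> whole GiB, rounded down.
footprint_gib() {
  echo $(( $1 * $2 * $3 / 1024 ))
}

# 300 MB per window is an assumption, not a measurement.
footprint_gib 300 360 1    # one host, 12 hours: prints 105
footprint_gib 300 360 10   # ten hosts, if collected centrally: prints 1054
```

Measure one real window on the target host first and substitute its size; stack depth and CPU count move the per-window figure considerably.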
If you want fleet-wide continuous profiling, the production-grade replacements are Parca / Pyroscope / gProfiler / Strobelight — they symbolise and deduplicate data server-side so the per-host footprint is manageable. Pinterest was rolling out gProfiler with Intel concurrently for exactly this reason; the bash loop was the ad-hoc bridge.
Failure modes
- Overhead matters on hot hosts. -F 997 or -F 999 would give finer resolution but also substantially higher CPU overhead, which could mask the starvation signal you're hunting.
- Kernel symbols must be available. Missing /proc/kallsyms or stripped kernel builds give you hex addresses instead of function names in the flamegraph. Validate perf script output early.
- Disk fills. Watch free space; add find . -name 'perf-*.data' -mmin +720 -delete in parallel if the investigation runs longer than expected.
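For the disk-fill case, one option is to have the recording loop check free space before each window and stop when it runs low. This is a sketch of such a guard, not part of the source loop: free_mb and enough_space are hypothetical names, and the 1024 MiB threshold in the usage comment is an arbitrary assumption.

```bash
# Hypothetical guard, not part of the source loop.

# free_mb DIR -> free space on DIR's filesystem in MiB.
# Uses POSIX-format df output, where column 4 is available 1024-byte blocks.
free_mb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# enough_space DIR MIN_MB -> succeeds only if at least MIN_MB MiB are free.
enough_space() {
  [ "$(free_mb "$1")" -ge "$2" ]
}

# Usage inside the recording loop (1024 MiB is an assumed threshold):
#   enough_space . 1024 || { echo "low disk, stopping" >&2; break; }
```

Stopping early keeps the windows already captured, whereas the find-based pruning keeps recording at the cost of the oldest windows; which trade-off fits depends on whether the event has already fired.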
Seen in
- sources/2026-04-15-pinterest-finding-zombies-in-our-systems-cpu-bottlenecks
— canonical production use. The exact bash loop above ran on
Pinterest's reserved debug fleet for one night; post-hoc time-
travel to the window around an ENA reset revealed kubelet
burning 6.5% of total CPU on
mem_cgroup_nr_lru_pages— the datum that ended a 3-month investigation.
Related
- concepts/temporal-profiling — the concept this pattern instantiates
- patterns/reserved-host-repro-env — natural companion
- systems/linux-perf — the sampler
- systems/flamescope — the time-travel visualiser