PATTERN Cited by 1 source

Periodic sampling memory profiler

Periodic sampling memory profiler is the pattern of repeatedly polling an instantaneous-state counter table at a fixed interval and retaining a sliding window of samples to reconstruct memory behaviour over time.

(Source: sources/2026-04-21-planetscale-profiling-memory-usage-in-mysql.)

Motivation

Memory counter tables like MySQL's memory_summary_* expose current state, not history. A single SELECT shows one instant. For a query whose memory footprint varies over execution — which is the common case — a single sample is insufficient. Reconstruction of the footprint over time requires repeated sampling.

Structural shape

while running:
    rows = query_memory_counters(thread_id)
    record(rows, timestamp=now)
    sleep(interval_ms)

Three knobs:

  1. Interval: the Dicken post uses 250 ms in the minimal script and a configurable --frequency (default 500 ms) in the visualisation script. Shorter intervals give finer temporal resolution at the cost of more sampling queries per second.
  2. Top-N filter: ORDER BY current_number_of_bytes_used DESC LIMIT N reduces per-sample output to the N largest allocators. The minimal script uses LIMIT 4; the visualisation script drops the LIMIT and post-filters.
  3. Sliding window: how many samples to retain in memory before discarding old ones. The visualisation script retains 50 samples, which at the default 500 ms interval covers 25 s of history.
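
The loop and its three knobs can be sketched in Python as follows. This is a minimal illustration, not the scripts from the Dicken post: sample_loop, fetch_rows, max_samples and TOP_N_QUERY are hypothetical names, and fetch_rows stands in for whatever callable runs the query through a MySQL client and returns (event_name, bytes) pairs.

```python
import time
from collections import deque

# Query text built from the table and columns named on this page.
# The %s placeholders assume a DB-API driver that uses the "format" paramstyle.
TOP_N_QUERY = (
    "SELECT event_name, current_number_of_bytes_used "
    "FROM performance_schema.memory_summary_by_thread_by_event_name "
    "WHERE thread_id = %s "
    "ORDER BY current_number_of_bytes_used DESC LIMIT %s"
)


def sample_loop(fetch_rows, interval_s=0.25, top_n=4, window=50,
                max_samples=None, sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_rows() repeatedly and keep only the newest `window` samples.

    fetch_rows is a hypothetical callable returning a list of
    (event_name, bytes_used) tuples for the target thread.
    """
    history = deque(maxlen=window)  # sliding window: old samples fall off the left
    start = clock()
    taken = 0
    while max_samples is None or taken < max_samples:
        rows = fetch_rows()
        # Top-N filter applied client-side, mirroring ORDER BY ... DESC LIMIT N.
        top = sorted(rows, key=lambda r: r[1], reverse=True)[:top_n]
        elapsed_ms = int((clock() - start) * 1000)
        history.append((elapsed_ms, top))
        taken += 1
        sleep(interval_s)  # interval knob
    return history
```

Passing sleep and clock as parameters is a design choice made here so the loop can be exercised without real delays; a deque with maxlen gives the sliding window without any explicit eviction code.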

Rendering options

  • Text — print each sample to stdout (minimal script): ## Memory usage at time 4250 ## \n innodb/row0sel -> 25.22Kb \n ...
  • Live plot — stream each sample into an interactive matplotlib stackplot (visualisation script). See patterns/live-visualization-of-sampled-metrics.
  • CSV / JSON dump for post-hoc analysis — mentioned by the post as an alternative not walked through.
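
A rough sketch of the text and CSV renderings, assuming each sample is a pair of (elapsed milliseconds, list of (event_name, bytes) rows). The function names format_sample and dump_csv and the CSV column names are illustrative assumptions; the post does not walk through a CSV dump, and the text layout below only imitates the sample output quoted above.

```python
import csv
import io


def format_sample(elapsed_ms, rows):
    """Render one sample in the style of the minimal script's text output."""
    lines = [f"## Memory usage at time {elapsed_ms} ##"]
    for event_name, nbytes in rows:
        # Bytes are shown as kibibytes with two decimals (assumed formatting).
        lines.append(f"{event_name} -> {nbytes / 1024:.2f}Kb")
    return "\n".join(lines)


def dump_csv(history, out):
    """Write every retained sample as one CSV row per (sample, event) pair."""
    writer = csv.writer(out)
    writer.writerow(["elapsed_ms", "event_name", "bytes_used"])  # assumed header
    for elapsed_ms, rows in history:
        for event_name, nbytes in rows:
            writer.writerow([elapsed_ms, event_name, nbytes])


if __name__ == "__main__":
    history = [(4250, [("innodb/row0sel", 25825)])]
    print(format_sample(*history[0]))
    buf = io.StringIO()
    dump_csv(history, buf)
    print(buf.getvalue())
```

The long (one row per event per sample) CSV layout is chosen because it loads directly into most analysis tools for post-hoc grouping by event name.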

Failure modes

  • Aliasing: memory spikes shorter than the sampling interval can be missed entirely. If a 100 ms spike starts at a uniformly random point relative to a 500 ms sampling clock, the chance that no sample lands inside it is 1 - 100/500 = 80 %.
  • Query completes between samples: short queries never appear in the record. Dicken flags this: the pattern is useful only for "longer-running queries, ones that take multiple seconds or minutes."
  • Observer overhead: the sampling query itself executes on its own thread, allocates its own buffers, and shows up in memory_summary_*_by_event_name rows for that thread. It does not pollute the target thread's counters because of the per-thread grain (see concepts/memory-profiling-granularity).
  • Counter saturation: current_number_of_bytes_used is a signed 64-bit counter; wraparound is a theoretical concern, not a practical one.
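
The miss probability quoted under aliasing can be checked with a small calculation. The sketch below assumes an idealised sampler that fires at exact multiples of the interval and a spike whose start time is uniformly distributed over one interval; both function names are illustrative.

```python
def miss_probability(spike_ms, interval_ms):
    """Closed form: chance that no sample lands inside the spike."""
    if spike_ms >= interval_ms:
        return 0.0  # a spike at least one interval long always contains a sample
    return 1.0 - spike_ms / interval_ms


def miss_fraction_by_enumeration(spike_ms, interval_ms):
    """Cross-check by trying every integer-millisecond start offset."""
    missed = 0
    for start in range(interval_ms):
        # First sample instant at or after the spike start (ceiling division).
        next_sample = -(-start // interval_ms) * interval_ms
        # The spike covers [start, start + spike_ms); it is missed if the
        # next sample arrives only after the spike has ended.
        if next_sample >= start + spike_ms:
            missed += 1
    return missed / interval_ms


if __name__ == "__main__":
    print(round(miss_probability(100, 500), 3))
    print(miss_fraction_by_enumeration(100, 500))
```

For a 100 ms spike and a 500 ms interval, the closed form and the enumeration agree on a miss rate of 0.8; halving the interval to 250 ms lowers it to 0.6.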

Generalisation

The same shape applies to any sampled-counter observability source with no built-in history:

  • /proc/<pid>/status for Linux process memory
  • /proc/<pid>/stat for CPU time
  • SHOW ENGINE INNODB STATUS snapshots
  • pg_stat_activity for Postgres session state
  • CloudWatch custom metrics pulled via GetMetricData

The pattern is structurally what a Prometheus scrape does at a ~15 s interval across a fleet; here the same loop runs at a 250 ms interval against a single session.
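
As one illustration of the generalisation, the same polling shape can be pointed at /proc/<pid>/status on Linux. The sketch below is an assumption-laden example rather than anything from the post: parse_vm_rss_kib and sample_rss are hypothetical helpers, and the loop relies on the VmRSS line that Linux reports in kB.

```python
import time
from collections import deque


def parse_vm_rss_kib(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None  # kernel threads and some processes have no VmRSS line


def sample_rss(pid, interval_s=0.25, window=50, max_samples=10):
    """Poll resident set size for one process, keeping a sliding window."""
    history = deque(maxlen=window)
    for _ in range(max_samples):
        # Linux-only: the file reflects current state each time it is read.
        with open(f"/proc/{pid}/status") as fh:
            history.append((time.monotonic(), parse_vm_rss_kib(fh.read())))
        time.sleep(interval_s)
    return history
```

Only the fetch step changes between sources; the interval and sliding-window handling stay the same as in the MySQL case.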

Seen in

  • PlanetScale's Profiling memory usage in MySQL (2024-04-11). Canonical instance: two Python scripts sample performance_schema.memory_summary_by_thread_by_event_name at 250 ms (minimal) or configurable (visualisation), with a 50-sample sliding window in the visualisation case. (Source: sources/2026-04-21-planetscale-profiling-memory-usage-in-mysql.)