Skip to content

PATTERN Cited by 1 source

Digest-based query prioritization

Problem. A production MySQL workload executes thousands of distinct SQL patterns daily. Manually running each query to find the slow ones doesn't scale; browsing a slow-query log lists individual executions without revealing which patterns burn the most cumulative time.

Solution. Sort the per-digest aggregate table (performance_schema.events_statements_summary_by_digest or the curated sys views over it) by a dimension matching your diagnostic question. Use the top-N result as the candidate list for drill-down via EXPLAIN, EXPLAIN ANALYZE, or stage-timing profiling.

(Source: sources/2026-04-21-planetscale-identifying-and-profiling-problematic-mysql-queries.)

Sort-axis selection

The axis you sort by determines what kind of fix you're hunting for:

Sort axis Question answered Typical fix
SUM_TIMER_WAIT DESC Where is the workload spending cumulative time? Attack the heaviest lines; even small per-execution improvements compound.
AVG_TIMER_WAIT DESC Which queries are slowest per-execution? Missing index, bad plan, lock contention.
COUNT_STAR DESC Which queries run most often? Caching, batching, application-side changes.
rows_examined_avg DESC (via sys.statements_with_runtimes_in_95th_percentile) Which queries read too much? Missing or wrong index.
via sys.statements_with_full_table_scans ORDER BY exec_count DESC Which full-table-scanning queries run most? Add an index.
via sys.statements_with_sorting Which queries do filesorts? Covering index.

Worked example

From the PlanetScale post:

SELECT substring(query,1,50), avg_latency, rows_examined_avg
  FROM sys.statements_with_runtimes_in_95th_percentile
  ORDER BY rows_examined_avg DESC LIMIT 10;

Top result: SELECT alias FROM chat.message LIMIT ? with avg_latency = 2.13 s, rows_examined_avg = 10,000,000 — an instant "add an index or rewrite" signal.

Relationship to Insights / APM

The PlanetScale post positions PlanetScale Insights as the productised version of this pattern — the same digest data surfaced as a dashboard with sort controls, anomaly detection, and drill-downs. Any hosted MySQL APM (Datadog DBM, New Relic, Percona PMM) is a packaging of this same digest primitive plus its Postgres equivalent (pg_stat_statements).

Caveats

  • Digest grouping is literal-stripped, not semantic-equivalent — SQL differing only in literals collapse; SQL differing in column list does not.
  • Only surfaces queries that ran — unexecuted problems (unused indexes, never-run candidate queries) are invisible.
  • performance_schema data is in-memory and bounded — aggregates start fresh on restart; long-tail history requires export to durable storage.
  • Sorting by average hides bimodal distributions — a digest with p50 = 5 ms and p99 = 30 s averages to potentially nothing alarming; use MAX_TIMER_WAIT as a sanity check.
  • Sort axis is a heuristic, not a guarantee — the top digest by SUM_TIMER_WAIT may be unfixable (a vendor tool's periodic stats query). Operator judgment still required for triage.

Seen in

  • PlanetScale's field manual for query diagnosis (2024-03-29). Dicken teaches the pattern twice: first via raw performance_schema.events_statements_summary_by_digest sorted by avg_timer_wait, then via sys.statements_with_runtimes_in_95th_percentile sorted by rows_examined_avg, then via sys.statements_with_full_table_scans sorted by exec_count. Each sort axis produces a different candidate list with a different natural fix. (Source: sources/2026-04-21-planetscale-identifying-and-profiling-problematic-mysql-queries.)
Last updated · 378 distilled / 1,213 read