Skip to content

CONCEPT Cited by 1 source

Query digest

A query digest is a normalised form of a SQL statement with literal values stripped, used as the unit of workload aggregation in observability tooling. Two SQL executions with the same digest are treated as instances of the same query for the purpose of counting, timing, and prioritising — regardless of what specific literal values were passed.

(Source: sources/2026-04-21-planetscale-identifying-and-profiling-problematic-mysql-queries.)

Why it exists

A production MySQL host may see millions of distinct SQL strings per day (each different parameter value produces a different string) but only a few hundred distinct shapes. Aggregating by shape collapses a high-cardinality stream into a low-cardinality workload fingerprint that humans can inspect.

The PlanetScale post opens with this framing: a "large application that executes numerous queries on the database" produces "thousands of query patterns" — the digest is the pattern.

MySQL's digest in performance_schema

The canonical digest table is performance_schema.events_statements_summary_by_digest, keyed by (SCHEMA_NAME, DIGEST). Per-digest aggregates:

  • COUNT_STAR — execution count
  • SUM_TIMER_WAIT — total picoseconds across all executions
  • MIN_TIMER_WAIT / AVG_TIMER_WAIT / MAX_TIMER_WAIT — per-execution bounds
  • SUM_LOCK_TIME — total lock-wait picoseconds
  • DIGEST_TEXT — the normalised form of the SQL

Worked example from the post:

SELECT * FROM performance_schema.events_statements_summary_by_digest
  WHERE schema_name = 'game'
  ORDER BY avg_timer_wait DESC
  LIMIT 1 \G;
-- DIGEST_TEXT | SELECT `p1` . `username` , `m` ...
-- COUNT_STAR  | 2486
-- AVG_TIMER_WAIT | 171304165000   (~0.17 seconds)

Sort dimensions for prioritisation

Different sort orders answer different diagnostic questions:

  • ORDER BY SUM_TIMER_WAIT DESC — where is the workload spending its time? (cumulative burn)
  • ORDER BY AVG_TIMER_WAIT DESC — which individual queries are slow? (per-execution cost)
  • ORDER BY COUNT_STAR DESC — which queries run most often? (candidates for caching or batching)
  • ORDER BY rows_examined_avg DESC (via sys.statements_with_runtimes_in_95th_percentile) — candidates for missing indexes

This is patterns/digest-based-query-prioritization — pick the sort axis that matches the fix you're looking for.

Digest grouping limits

  • Literal-stripped, not semantic-equivalent. Two SQL strings with the same logical meaning but different column lists or table aliases produce different digests.
  • Schema-scoped. The same SQL text against different schemas is different digests.
  • Normalisation is MySQL's choice. Not every engine digests the same way; cross-engine comparison requires care.

Seen in

Last updated · 378 distilled / 1,213 read