PlanetScale — Identifying and profiling problematic MySQL queries¶
Summary¶
Ben Dicken (PlanetScale, 2024-03-29) publishes a pedagogical
field manual for native MySQL query diagnosis: how to use
performance_schema + sys tables to identify which
queries to fix, and EXPLAIN ANALYZE + stage-timing
profiling via setup_instruments / setup_consumers /
events_stages_history_long to drill into individual
query execution. The post closes by positioning
PlanetScale Insights as the
product that replaces the manual poking-around workflow with
a visualisation + anomaly-detection dashboard over the same
underlying data.
Key takeaways¶
performance_schemais an in-memory storage engine ("all of the information it tracks is stored in an in-memoryPERFORMANCE_SCHEMAstorage engine") containing ~113 tables on recent MySQL versions. It's on by default but can be disabled on memory-constrained hosts. (Source: this post.)events_statements_summary_by_digestis the entry point for "which queries are expensive?" — keyed by digest (normalised SQL with literals stripped), it exposesCOUNT_STAR,SUM_TIMER_WAIT,AVG_TIMER_WAIT,MAX_TIMER_WAIT,SUM_LOCK_TIME. Timer values are in picoseconds (divide by 1e12 for seconds).sysschema is the ergonomic front-end overperformance_schema. Canonical diagnostic tables named:statements_with_sorting,statements_with_runtimes_in_95th_percentile,statements_with_full_table_scans.table_io_waits_summary_by_index_usagesurfaces per-index hit counts including reads that bypassed every index (INDEX_NAME IS NULLrow). Worked example: 2.57 billion unindexed reads ongame.messagevs ~164k for the one used index (from_id) — diagnostic signature of missing indexes.- Stage-level profiling via
events_stages_history_longrequires three config toggles:setup_instruments.ENABLED = 'YES',setup_consumers.ENABLED = 'YES', andsetup_actors.HISTORY = 'YES'. Workflow: capturethread_idfromperformance_schema.threads, run the query, look up the statement inevents_statements_history_long, bracket with itsevent_id/end_event_id, then queryevents_stages_history_longbetween those bounds to see per-stage milliseconds (stage/sql/executing,stage/sql/optimizing,stage/sql/statistics,stage/sql/Opening tables,stage/sql/waiting for handler commit, ...). - Worked stage-profile datum: on the post's problematic
query,
stage/sql/executingconsumed 735.3 ms out of a ~736 ms total — execution-bound, not lock-bound or optimising-bound. The value of stage timing is that it would flag lock-wait or optimiser time if those were the bottleneck. - Selective profiling via
setup_actors: flip global defaults off withUPDATE setup_actors SET ENABLED='NO', HISTORY='NO' WHERE HOST='%' AND USER='%', then insert a specific(HOST, USER)row to scope instrumentation to a single test principal — avoids the performance tax of fleet-wide history tracking. EXPLAIN ANALYZEreports actual per-iterator costs alongside planner estimates — the worked query shows a nested-loop join with a full 1M-row table scan onmessagefollowed by two single-row PK lookups per row, actual 320 ms on 345,454 rows.- PlanetScale Insights is
the vendor's answer to the tedium: "gleaning this
information can be tedious. Getting exactly what you want
requires significant poking around and digging through
tables in
performance_schemaandsys." Insights provides visualisations + sort-by-rows-read + automatic anomaly detection over the same underlying digest data.
Systems extracted¶
- MySQL — target engine;
performance_schema+sysare shipped with the server. - InnoDB — storage-engine context for the index-usage table (PRIMARY / secondary keys).
- PlanetScale Insights — productised observability over the same data; Postgres + MySQL surface.
- PlanetScale — publisher; positions Insights as the managed-service alternative to manual poking.
Concepts extracted¶
- concepts/mysql-performance-schema — in-memory instrumentation subsystem; ~113 tables; in-memory storage engine.
- concepts/mysql-sys-schema — curated views over
performance_schemafor human-readable diagnosis. - concepts/query-digest — normalised SQL shape keyed by literal-stripped form; unit of workload aggregation.
- concepts/query-stage-profiling — per-execution-stage timing via
setup_instruments+events_stages_history_long. - concepts/mysql-explain-analyze — execution-profile variant already on wiki (extended).
- concepts/mysql-access-type — already on wiki; referenced for interpretation of full-table-scan findings.
- concepts/observability — extended with database-internal-telemetry axis.
Patterns extracted¶
- patterns/digest-based-query-prioritization — sort digest table by
SUM_TIMER_WAIT/AVG_TIMER_WAIT/ rows-examined-avg to select queries for investigation. - patterns/stage-level-query-profiling — three-toggle + thread-id + event-id bracket workflow for per-stage millisecond breakdown.
- patterns/index-usage-per-table-diagnostic —
table_io_waits_summary_by_index_usageWHEREobject_name=... to surface unindexed-read volume relative to per-index hits. - patterns/explain-before-execute-validation — already on wiki; referenced as the drill-down complement to digest-level identification.
Operational numbers¶
- ~113
performance_schematables on a recent MySQL. - Timer unit: picoseconds (divide by 1 trillion for seconds).
- Worked datum: 735.3 ms in
stage/sql/executingout of ~736 ms total wall-clock. - Worked datum: 2.57 × 10⁹ unindexed reads vs 164,500 indexed reads on a single table — ratio as diagnostic signal.
- Worked datum: full-table-scan query running 6,742 times accumulating 6.45 min of latency.
EXPLAIN ANALYZEworked datum: nested-loop join with table scan onm(1M rows) + single-row PK lookup onp1+ single-row PK lookup onp2, actual 320 ms for 345,454 rows.
Caveats¶
- Instrumentation has overhead — Dicken explicitly flags "(small) adverse effect on the overall performance of your system" from enabling history for all users; selective actor scoping is the mitigation.
performance_schemais in-memory — data is lost on restart; no retention beyond memory size; no long-term trend analysis without an exporter.EXPLAIN ANALYZEactually runs the query — unsafe for expensive queries on production and forUPDATE/DELETE/INSERTside effects (already canonicalised on theEXPLAIN ANALYZEwiki page).- Timer-unit surprise — picoseconds is easy to misread as nanoseconds; every derived number is 1000× off if the mistake is made.
- Digest grouping is literal-stripped — queries with different parameters are grouped; non-parameter differences (different table names, different column lists) are separate digests.
- No production deployment numbers — post is pedagogy; PlanetScale doesn't disclose how Insights' own ingestion of this data scales, nor the retention policy or sampling strategy behind the visualisations.
- Tier-3 PlanetScale pedagogical voice — Ben Dicken's fifth-plus wiki ingest (canonical database-internals educator); default-include per companies/planetscale skip rules.
Source¶
- Original: https://planetscale.com/blog/identifying-and-profiling-problematic-mysql-queries
- Raw markdown:
raw/planetscale/2026-04-21-identifying-and-profiling-problematic-mysql-queries-9efa4ae0.md
Related¶
- systems/mysql — host engine;
performance_schema+sysship with it. - systems/innodb — storage-engine context for unindexed-read diagnosis.
- systems/planetscale-insights — productised successor to the manual workflow.
- concepts/mysql-performance-schema — new canonical wiki page.
- concepts/mysql-sys-schema — new canonical wiki page.
- concepts/query-digest — new canonical wiki page.
- concepts/query-stage-profiling — new canonical wiki page.
- concepts/mysql-explain-analyze — drill-down companion.
- concepts/mysql-access-type — interpretation of full-table-scan findings.
- patterns/digest-based-query-prioritization — identify what to fix.
- patterns/stage-level-query-profiling — drill into why it's slow.
- patterns/index-usage-per-table-diagnostic — surface missing indexes.
- companies/planetscale — publisher.