Skip to content

PLANETSCALE 2023-03-29 Tier 3

Read original ↗

PlanetScale — How to read MySQL EXPLAINs

Summary

A PlanetScale pedagogical deep-dive by Savannah Longoria (Technical Solutions team, published 2023-03-29) on how to read and act on the output of MySQL's EXPLAIN keyword. The post is the wiki's canonical reference for what every column in EXPLAIN output actually means and how to use them together to diagnose slow queries and verify index usage. Longoria walks the full column schema — id, select_type, table, partitions, type (the access-type ladder from NULL/const/eq_ref/refrange/indexALL), possible_keys, key, key_len, rows, filtered, Extra — with the article's central operational claim: "the fewer rows your query accesses, the faster your queries will be," and the rows × filtered arithmetic that tells you how many rows MySQL expects to feed into the next join stage. Introduces EXPLAIN ANALYZE (MySQL 8.0.18+) — which actually runs the query to produce real per-iterator timings alongside the planner's estimates (time to first row, time executing the iterator, actual rows, loops) — and warns this is unsafe on live data or mutations. Canonicalises the type column as the "access type" (not "join type" as the MySQL manual calls it — the column appears even on queries with no joins) and renders the performance ladder with traffic-light colour coding: green (NULL/system/const/eq_ref/ref), yellow (fulltext/index/range), red (ALL). The worked climax is the Kendra Puppo composite-index pattern on the MySQL Employees sample database: a two- predicate WHERE last_name = 'Puppo' AND first_name = 'Kendra' query runs type: ALL, rows: 299,202 unindexed, flips to type: ref, rows: 1 after CREATE INDEX fullnames ON employees(last_name, first_name) — a 300,000× reduction in rows examined. The "phone book inside a phone book" mental model for composite indexes is from this post, as is the canonical wiki explanation of why two single-column indexes underperform a single composite index for AND-predicates. Minor sections cover SHOW WARNINGS; rewriter hints when EXPLAIN can't parse the query (<index_lookup>, <if>, <primary_index_lookup>, <temporary table> markers), EXPLAIN's known limitations (no triggers / stored functions / UDFs; stats-based row estimates can be far off; collapsing "filesort" / "Using temporary" ambiguity), and the full Extra column value dictionary (Using index, Using where, Using temporary, Using filesort, Backward index scan, Using index for group-by, etc.). Complements Justin Gage's July-2022 pedagogical piece — Gage names EXPLAIN and SHOW INDEX as the operational tools; Longoria's 2023 post is the field-manual for what their output means. Together with Ben Dicken's B+tree post they form the canonical PlanetScale pedagogy trilogy: Gage (what are indexes?), Dicken (how are they stored?), Longoria (how do you inspect whether they're being used?).

Key takeaways

  1. EXPLAIN is the query-plan inspection tool — it does not run the query. Prepend the keyword to any SELECT / UPDATE / DELETE / TABLE statement and MySQL emits the plan with estimated costs. The output per plan row describes how MySQL would execute the query — the access method chosen per table, which indexes it could use, which one it did choose, an estimate of rows scanned, and any extra work (temporary tables, filesorts, etc.). Canonicalised as concepts/mysql-explain. Verbatim from the post: "When you prepend the EXPLAIN keyword to the beginning of a query, it explains how the database executes that query and the estimated costs."

  2. The type column is the single highest-signal field. Longoria renders it as a colour-coded performance ladder (green → yellow → red) that's the fastest triage tool for an EXPLAIN plan:

  3. 🟢 NULL (resolved at optimisation), system (one row or empty), const (PK/unique-index lookup), eq_ref (unique lookup per join row), ref (index equality lookup).
  4. 🟡 fulltext (join via fulltext index), index (full index scan), range (bounded index scan).
  5. 🔴 ALL (full table scan). Plus out-of-ladder special types: index_merge (MySQL intersecting / unioning multiple indexes on one table), unique_subquery / index_subquery (IN-subquery optimisations). The article explicitly reframes the MySQL-manual name "join type""access type" because the column fires even without joins. Canonicalised as concepts/mysql-access-type.

  6. key_len is the only way to tell partial composite- index usage. When MySQL chooses a composite index, the key column names it but doesn't reveal how many columns of the composite it used. key_len (bytes) is the only signal: if the composite is on (last_name VARCHAR(16), first_name VARCHAR(14)) and key_len is about the length of just last_name, the plan is using only the leading column — the first_name predicate is post-filtering, not index-binding. Canonicalised as an inspection rule in concepts/mysql-explain and patterns/composite-index-for-and-predicate. Verbatim: "When MySQL chooses a composite index, the length field is the only way you can determine how many columns from that composite index are in use."

  7. rows × filtered ≈ rows actually fed into the next join. rows is the planner's estimate of rows MySQL must scan to satisfy the predicate for this table; filtered is a pessimistic percentage estimate of how many of those survive the WHERE / join predicate. Their product approximates the row count flowing into the next step. "If you multiply the rows column by this percentage, you will see the number of rows MySQL estimates it will join with the previous tables in the query plan." The article's rows column is also the reader's primary feedback loop for index tuning: "This column displays how many rows MySQL accessed to complete a request, which can be useful when designing indexes. The fewer rows your query accesses, the faster your queries will be."

  8. EXPLAIN ANALYZE runs the query to produce real numbers — unsafe in production. Introduced in MySQL 8.0.18, EXPLAIN ANALYZE executes the statement and reports six metrics per iterator: estimated execution cost, estimated rows, time to first row, time spent in iterator (average per loop when looped), actual rows, number of loops. The estimates-vs-actuals divergence is the diagnostic signal — a planner expecting 10 rows from an iterator that returned 10M is why the query is slow. The hazard: "EXPLAIN ANALYZE actually runs the query, so if you don't want to run the query against your live database, do not use EXPLAIN ANALYZE." Auto-selects FORMAT=TREE; raw-JSON format is not supported. Supports SELECT, multi-table UPDATE, DELETE, TABLE statements. Canonicalised as concepts/mysql-explain-analyze.

  9. The canonical composite-index before/after — 299,202 rows → 1 row. The post's worked example on the MySQL Employees sample dataset is the wiki's canonical demonstration of how to verify a composite index via EXPLAIN:

SELECT * FROM employees
WHERE last_name = 'Puppo' AND first_name = 'Kendra';

Before indexing: type: ALL, rows: 299,202 — a full table scan because MySQL has no index covering both predicates.

Two single-column indexes (anti-pattern): explicitly named and rejected — "MySQL knows how to find all employees named Puppo … all employees named Kendra … However, it doesn't know how to find people named Kendra Puppo." Index Merge may or may not fire; "even if it does, in many scenarios, they won't serve the purpose as well as a dedicated index."

Correct fix: CREATE INDEX fullnames ON employees(last_name, first_name). After re-running EXPLAIN: type: ref, rows: 1, key: fullnames. Same result, 300,000× less work. Canonicalised as patterns/composite-index-for-and-predicate and the "phone book inside a phone book" mental model as concepts/composite-index.

  1. SHOW WARNINGS; surfaces rewriter hints when the query doesn't parse. If EXPLAIN refuses to run because the query is malformed, SHOW WARNINGS; reports MySQL's best attempt at partial structural observations via special markers: <index_lookup> ("an index lookup would happen if the query had been properly parsed"), <primary_index_lookup> (primary key variant), <if>(cond, expr1, expr2), and <temporary table> ("an internal table would be created here for saving temporary results — for example, in subqueries prior to joins"). Not a substitute for a valid EXPLAIN, but an escape hatch for debugging parse-failing queries.

  2. EXPLAIN is an approximation — explicit limitations. The post is candid: "EXPLAIN is an approximation. Sometimes it's a good approximation, but at other times, it can be very far from the truth." Limits enumerated:

  3. No visibility into triggers, stored functions, or UDFs.
  4. Doesn't work for stored procedures.
  5. Doesn't reflect runtime optimisations MySQL performs during execution.
  6. Statistics-based estimates can be severely wrong (see rows, filtered).
  7. Collapses distinctions: "filesort" covers both in-memory and on-disk sorts; "Using temporary" covers both in-memory and on-disk temp tables. PlanetScale-specific caveat: "PlanetScale does not support Triggers, Stored Procedures, and UDFs" — see MySQL compatibility docs.

  8. The Extra column dictionary is the long tail of diagnostic signals. Most-common values documented:

  9. Using index — covering-index / index-only scan (the second B+tree fetch is skipped).
  10. Using where — storage engine returned rows; MySQL post-filters them.
  11. Using temporary — MySQL materialises a temp table during sort (expensive).
  12. Using filesort — external sort rather than reading in index order.
  13. Range checked for each record — no good index, indexes re-evaluated per row in a join (very bad).
  14. Using index condition — index-condition pushdown: read index tuples, test them, then conditionally fetch full rows.
  15. Backward index scan — descending index scan (MySQL 8.0+).
  16. Using index for group-byGROUP BY answered from index without materialising groups.
  17. No tables used — query has no FROM. Each is a named diagnostic hook for what the optimiser decided to do.

Systems / concepts / patterns extracted

Systems: - systems/mysql — the canonical OLTP store the article targets. - systems/innodb — the storage engine whose B+tree clustered-index shape makes the composite-index example structurally meaningful. - systems/planetscale — the article mentions the PlanetScale CLI and web UI for SHOW TABLES; / running EXPLAIN interactively, and the PlanetScale-specific compatibility note (no Triggers / Stored Procedures / UDFs). - systems/planetscale-insights — mentioned in closing: "our Insights feature + EXPLAIN statement in MySQL can be of massive assistance when you need to optimize the performance of your queries."

Concepts (new to the wiki): - concepts/mysql-explain — canonical wiki reference for EXPLAIN's column schema (id / select_type / table / partitions / type / possible_keys / key / key_len / rows / filtered / Extra), the rows × filtered arithmetic, SHOW WARNINGS; markers, limitations, and output-format options (tabular / vertical / tree / JSON). - concepts/mysql-explain-analyze — the 8.0.18+ execution-profiling variant with per-iterator real timings; safety warnings; FORMAT=TREE constraint. - concepts/mysql-access-type — the type column as access-method ladder; colour-coded green/yellow/red; index_merge / unique_subquery / index_subquery out-of-ladder variants. - concepts/composite-index — multi-column index with the phone-book-inside-a-phone-book mental model and leftmost-prefix rule.

Patterns (new to the wiki): - patterns/composite-index-for-and-predicate — canonical before/after for AND-predicate queries: type: ALL, rows: Ntype: ref, rows: 1 via CREATE INDEX (col1, col2, …). Includes column-ordering guidance and the two-separate-indexes anti-pattern.

Operational numbers

  • 300k rows → 1 row — the Employees sample DB's employees table scanned 299,202 rows on the unindexed WHERE last_name='Puppo' AND first_name='Kendra' query; fell to 1 row after CREATE INDEX fullnames ON employees(last_name, first_name). The rows column is the reader's primary feedback loop.
  • MySQL 8.0.18 — minimum version for EXPLAIN ANALYZE.
  • 4 statement shapes supported by EXPLAIN ANALYZE: SELECT, multi-table UPDATE, DELETE, TABLE.
  • 6 per-iterator metrics emitted by EXPLAIN ANALYZE: estimated execution cost, estimated rows, time-to-first- row, iterator execution time, actual rows, loop count.
  • 4 special SHOW WARNINGS markers when EXPLAIN fails: <index_lookup>, <primary_index_lookup>, <if>(cond, a, b), <temporary table>.

Caveats

  • 2023 post, re-surfaced 2026-04-21. The content is accurate as of MySQL 8.0 (and mostly earlier); future MySQL versions may extend the Extra dictionary or the supported output formats. EXPLAIN ANALYZE has been extended with JSON output in later 8.0.x patch releases — check current MySQL docs if raw-JSON output is required.
  • Tier-3 PlanetScale post, but firmly inside the pedagogy-cleared subset. Authored by a Technical Solutions team member, not an engineering byline (Dicken / Lambert / Burlacu) that default-includes. Clears Tier-3 scope on strength of (a) canonicalising four new concept pages (mysql-explain, mysql-explain-analyze, mysql-access-type, composite-index) and one new pattern that were absent from the wiki, (b) supplying the canonical worked example with real numbers (299,202 → 1 rows), (c) being the natural "how do I verify index usage?" companion to the already-ingested Gage / Dicken index-pedagogy pair.
  • No architecture diagrams, no production-incident narrative, no scale story — this is pedagogy, not post-mortem. Architecture density as defined in the scope filter is effectively 100% — the article IS database-internals education, it just isn't system- architecture education. Fits the "production-system content" Tier-3 carve-out for database internals.
  • PlanetScale-specific narrowing of applicability. The PlanetScale note that Triggers, Stored Procedures, and UDFs are unsupported is a hosted-service constraint; EXPLAIN's failure to cover them is a MySQL engine limitation. On self-hosted MySQL, the same EXPLAIN limitations apply — and profilers like performance_schema are the complement.
  • Composite-index column order under-discussed. The post gives the (last_name, first_name) example but doesn't deeply cover when column order matters (leading column always filtered vs selectivity-based ordering vs range-predicate breaking the prefix). The pattern page captures the column-order guidance the post implies.

Source

Last updated · 378 distilled / 1,213 read