
PLANETSCALE 2024-08-14 Tier 3


PlanetScale — Tracking index usage with Insights

Summary

Rafer Hazen (PlanetScale, 2024-08-14; re-fetched 2026-04-21) announces per-query-pattern index-usage tracking in PlanetScale Insights — a new Indexes tab showing a time series of which indexes each query pattern used, plus two new search predicates (index:table.index for "which queries use this index?" and indexed:false for "find unindexed patterns"). The post also discloses the VTGate-level mechanism that makes the feature possible: PlanetScale patched the InnoDB storage-engine handler's index-initialization function to record the index name in a per-query data structure, and returns the set of used indexes in the final MySQL packet alongside the normal query result. VTGate then aggregates per-query-pattern index counts and emits them into the Insights pipeline every 15 seconds. This is the wiki's first canonical disclosure of a handler-hook + wire-protocol-sidecar pattern for per-query index-usage telemetry, and the second canonical Insights surface disclosure following the 2023-04-20 fingerprinting-substrate post by the same author.

Key takeaways

  • Indexes tab: per-query-pattern time-series of which indexes were used. "The main graph shows a time series of the percentage of queries that use each index. The bar graph at the bottom shows cumulative usage over the entire time span. In the example above we see that MySQL is selecting from one of three indexes each time it executes the query pattern." Canonical productisation of concepts/per-query-used-index-set as a time-series surface — the per-execution set of indexes touched, aggregated across all executions of a query pattern. Canonicalised as concepts/index-usage-time-series. (Source: sources/2026-04-21-planetscale-tracking-index-usage-with-insights.)

  • Bidirectional query↔index attribution. "This view shows us which indexes a given query is using, but we can also ask the question in reverse: which queries are using a given index? To answer this, go to the main Insights page, and add index:$TABLE_NAME.$INDEX_NAME as a search term." Canonical wiki disclosure of bidirectional attribution as a first-class observability primitive — the index:products.idx_price search predicate inverts the per-pattern view into a per-index view. Sister to the tag:key:value + actor:X predicates canonicalised on prior Insights ingests. Canonicalised as patterns/bidirectional-index-query-attribution.

  • indexed:false as observability primitive. "We can also use insights to find a list of query patterns that use no indexes at all by entering indexed:false in the insights search box." Productises the "full-table-scan hunter" workflow that patterns/index-usage-per-table-diagnostic canonicalised at the performance_schema.table_io_waits_summary_by_index_usage altitude. Composable with other predicates: the worked example in the post combines three axes — "to find a list of all unindexed queries that have been executed at least a thousand times in the last 24 hours and have a p50 response time over 250ms, you'd enter indexed:false query_count:>1000 p50:>250" — canonical multi-predicate filter shape that surfaces high-volume + slow + unindexed patterns as a single search query.
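The composable multi-predicate filter shape can be modeled as a small parser plus a conjunction of checks. This is a hypothetical sketch — the field names mirror the post's example search string, but the grammar and matching logic here are illustrative, not PlanetScale's implementation:

```python
# Hypothetical model of Insights' composable search predicates.
# Predicate grammar: field:value, field:true/false, field:>threshold.

def parse_predicates(search):
    """Turn 'indexed:false query_count:>1000 p50:>250' into check functions."""
    checks = []
    for term in search.split():
        field, _, value = term.partition(":")
        if value.startswith(">"):
            threshold = float(value[1:])
            checks.append(lambda p, f=field, t=threshold: p[f] > t)
        elif value in ("true", "false"):
            checks.append(lambda p, f=field, v=(value == "true"): p[f] is v)
        else:
            checks.append(lambda p, f=field, v=value: p[f] == v)
    return checks

def search_patterns(patterns, search):
    """Keep only the query patterns that satisfy every predicate."""
    checks = parse_predicates(search)
    return [p for p in patterns if all(c(p) for c in checks)]

patterns = [
    {"fingerprint": "select * from products where price > ?",
     "indexed": False, "query_count": 4200, "p50": 310.0},
    {"fingerprint": "select * from users where id = ?",
     "indexed": True, "query_count": 90000, "p50": 1.2},
]
# The worked example from the post: high-volume + slow + unindexed.
hot = search_patterns(patterns, "indexed:false query_count:>1000 p50:>250")
```

Each predicate narrows the result set independently, which is what makes the three-axis query a single search string rather than three separate views.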

  • VTGate mechanism: hook into InnoDB's index_init. "Since PlanetScale databases exclusively use the InnoDB storage engine, we were able to focus our efforts there. The InnoDB storage handler includes an index-initialization function that MySQL calls (once) prior to using an index in a query. By recording the index name passed to this function in a per query data structure, we're able to find the set of all indexes used by each query." Canonical disclosure of the storage-handler-hook capture point. InnoDB's index_init() is invoked once per (query, index) pair at plan-execution start; intercepting the hook and appending the index name to a per-query list gives an exhaustive set of indexes touched per execution with no sampling. Canonicalised as concepts/innodb-index-init-hook.
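The capture mechanism can be sketched as a toy model: the index-initialization entry point fires once per (query, index) pair, and the patched handler appends the index name to a per-query structure. Class and method names below are illustrative stand-ins, not MySQL's actual handler API:

```python
# Toy model of the handler-hook capture point described in the post.

class QueryContext:
    """Per-query data structure the patched handler writes into."""
    def __init__(self, sql):
        self.sql = sql
        self.used_indexes = []

class PatchedHandler:
    """Stand-in for a storage-engine handler with the index-init hook."""
    def __init__(self, ctx):
        self.ctx = ctx

    def index_init(self, index_name):
        # Fires once per index the plan touches, not once per row,
        # so recording here yields the exhaustive per-execution set.
        self.ctx.used_indexes.append(index_name)

    def index_read(self, index_name, key):
        # Row-level reads add nothing: the set was captured at init.
        pass

ctx = QueryContext("SELECT * FROM products WHERE price > 10")
h = PatchedHandler(ctx)
h.index_init("products.idx_price")   # called once, prior to first use
for key in (10, 20, 30):             # many row reads, one recorded index
    h.index_read("products.idx_price", key)
```

The once-per-index firing is what keeps the overhead negligible: the hook cost is proportional to indexes touched, not rows scanned.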

  • Wire-protocol sidecar for telemetry. "When the query is finished we return the list of used indexes in the final packet returned by MySQL to the client, and ultimately to VTGate, Vitess's query proxying layer." Rather than exporting index-usage out-of-band (performance_schema polling, separate log channel), the per-query index set rides along on the existing query-response wire-protocol packet. Zero extra round-trips, zero extra connections, zero sampling. VTGate aggregates the sidecar data per-pattern and ships it into the Insights pipeline every 15 seconds — the same cadence canonicalised by the 2023-08-10 Insights sharded-MySQL pipeline post. Canonicalised as patterns/handler-hook-sidecar-telemetry — generalisable to any storage engine with a handler-layer abstraction + any protocol that has a final-packet or trailer-frame slot the sidecar can ride in.
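The sidecar shape can be modeled as a final response packet that carries the index set alongside the normal result, which the proxy peels off before aggregating. The packet layout and field names below are invented for illustration; the real patch extends MySQL's binary protocol, which this toy JSON framing does not attempt to reproduce:

```python
# Sketch of the wire-protocol sidecar: the per-query index set rides in
# the final response packet instead of a separate channel or connection.
import json

def final_packet(rows_affected, used_indexes):
    """Server side: attach the index set to the last packet of the result."""
    return json.dumps({"ok": True,
                       "rows": rows_affected,
                       "used_indexes": used_indexes})

def proxy_receive(packet):
    """Proxy (VTGate) side: peel off the sidecar, pass the result through."""
    msg = json.loads(packet)
    indexes = msg.pop("used_indexes", [])
    return msg, indexes

pkt = final_packet(3, ["products.idx_price"])
result, indexes = proxy_receive(pkt)
```

Because the telemetry rides on a packet the server was sending anyway, the design costs no extra round-trips or connections, which is what makes 100% coverage feasible.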

  • EXPLAIN and performance_schema named as the workflow predecessors — with their limits disclosed. Two tools canonicalised as inadequate, and the reasons why:

  • EXPLAIN: "Explain is an incredibly powerful tool that exposes a wealth of information about how MySQL executes your query, including index usage. It's great for troubleshooting a problematic query or testing out a new index. Unfortunately, explain can only provide information for queries you explicitly provide. It doesn't record information for the actual queries processed by MySQL, so it can't show how a query pattern is using indexes over time, across shards, or with the different query parameters from your production workload."
  • performance_schema.table_io_waits_summary_by_index_usage: "Stats are in the form of global counts for each MySQL server, and are reset when MySQL is restarted. This means you can't see usage trends over time, and counts may be reset at any time. Counters are only provided at the index level, so it's not clear which query patterns are using which indexes." Both limitations are load-bearing: EXPLAIN is query-specific but not production-sampled; table_io_waits_summary_by_index_usage is production-sampled but not query-attributed and not time-series. Insights's contribution is production-sampled + query-pattern-attributed + time-series — the three-axis completion of the predecessor tool taxonomy. Sister to Dicken's 2024-03-29 canonical MySQL-native workflow post (sources/2026-04-21-planetscale-identifying-and-profiling-problematic-mysql-queries) where the same limitation is framed closing-line: "Many of these same observations can be gathered much easier using PlanetScale Insights."

  • SELECT-only scope: critical caveat before dropping an index. "Note: index information is only reported for SELECT queries, so it's important to independently verify that indexes aren't being used in UPDATE or DELETE queries before removing an index." Canonical load-bearing gap — Insights index-usage telemetry does not cover DML. An index that appears zero-used according to Insights may still be holding up a production UPDATE or DELETE path via predicate evaluation or row lookup. This caveat is the safety invariant that turns "drop unused indexes" from a one-command action into a two-step verification workflow. Canonicalised as concepts/select-only-index-telemetry-caveat.

  • Aggregation pipeline reuses the 15-second Kafka cadence. "With the per-query index information in VTGate, we aggregate index usage information per query-pattern and send it into the Insights pipeline every 15 seconds. This approach allows us to aggregate the time series count of indexes used for 100% of queries with negligible overhead in MySQL." 100% coverage, no sampling at the MySQL-server tier — offered as the quantitative framing for "negligible overhead". Ties into the canonical Insights pipeline architecture from the 2023-08-10 storing-time-series post: aggregate-summary Kafka topic every 15s → in-memory coalescing by hash(database_id, fingerprint) → per-hour + per-minute rollup tables in sharded MySQL → custom loadable C++ functions for percentile reads. This post's index-usage telemetry joins that pipeline as an additional aggregate-summary axis keyed on the same per-pattern fingerprint.
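The VTGate-side aggregation step can be sketched as counts keyed by query-pattern fingerprint, flushed on the fixed 15-second cadence as per-index usage percentages (the quantity the Indexes tab graphs). Naming and structure here are assumptions, not Vitess code:

```python
# Minimal sketch of per-pattern index-usage aggregation with a fixed
# flush cadence, as described in the post.
from collections import Counter, defaultdict

FLUSH_INTERVAL_S = 15  # cadence stated in the post

class IndexUsageAggregator:
    def __init__(self):
        self.counts = defaultdict(Counter)   # fingerprint -> index -> count
        self.executions = Counter()          # fingerprint -> total runs

    def record(self, fingerprint, used_indexes):
        """Called once per query execution with its sidecar index set."""
        self.executions[fingerprint] += 1
        for idx in used_indexes:
            self.counts[fingerprint][idx] += 1

    def flush(self):
        """Emit per-pattern usage percentages, then reset (every 15s)."""
        out = {}
        for fp, per_index in self.counts.items():
            total = self.executions[fp]
            out[fp] = {idx: 100.0 * n / total for idx, n in per_index.items()}
        self.counts.clear()
        self.executions.clear()
        return out

agg = IndexUsageAggregator()
fp = "select * from products where price > ?"
for _ in range(3):
    agg.record(fp, ["products.idx_price"])   # 3 runs pick the secondary index
agg.record(fp, ["PRIMARY"])                  # 1 run falls back to PRIMARY
summary = agg.flush()
```

The flush output maps directly onto the Indexes tab's "percentage of queries that use each index" time series, one data point per pattern per interval.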

  • Feature availability: all PlanetScale databases. "Index usage information is available on all PlanetScale databases." No Enterprise gate; no opt-in toggle. Ships as a default observability surface on every cluster.

Systems / concepts / patterns extracted

Systems: systems/planetscale-insights (the Indexes tab is a new UI surface on Insights), systems/vtgate (VTGate is where the per-query index sets returned by MySQL are aggregated per-pattern and emitted into the Insights pipeline — load-bearing framing that extends VTGate's role beyond routing/proxying to telemetry-aggregation), systems/vitess (parent system under which VTGate sits), systems/mysql (the server producing the raw index-usage sidecar packets), systems/innodb (the storage engine whose handler index_init hook is the capture point), systems/planetscale (product).

Concepts (new): concepts/per-query-used-index-set (the per-execution list of indexes touched; the granularity primitive on which the feature depends), concepts/innodb-index-init-hook (the storage-handler index_init() function as the capture point), concepts/index-usage-time-series (the per-query-pattern time-series of index-use percentages as a distinct observability primitive from static EXPLAIN output or global performance_schema counters), concepts/select-only-index-telemetry-caveat (the DML gap).

Concepts (extended): concepts/query-pattern-fingerprint (index-usage telemetry is per-fingerprint, reusing the 2023-04-20 substrate), concepts/mysql-performance-schema (load-bearing predecessor whose limitations this post names: global-count + no-time-series + no-query-attribution), concepts/secondary-index (new production-observability context for the secondary-index concept: per-pattern time-series of usage is the new diagnostic signal for deciding which secondary indexes are load-bearing), concepts/observability (new observability primitive layered on top of the existing Insights tag/filter language).

Patterns (new): patterns/handler-hook-sidecar-telemetry (the generalisable pattern: intercept at the storage-engine handler, bolt the result onto the wire-protocol response packet, aggregate downstream — zero extra round-trips, zero sampling, 100% coverage), patterns/bidirectional-index-query-attribution (per-query-pattern → indexes; per-index → query patterns; both directions as first-class UI predicates).

Patterns (extended): patterns/index-usage-per-table-diagnostic (Insights indexed:false + Indexes tab is the productised UX over the same underlying "which table is missing an index?" question the table_io_waits_summary_by_index_usage pattern answered at the performance_schema altitude — but per-query-pattern, not per-table, and over-time, not since-last-restart), patterns/digest-based-query-prioritization (the indexed:false query_count:>1000 p50:>250 multi-predicate filter is the productised form of the "sort digests by rows-examined / avg-timer" workflow Dicken canonicalised for the native MySQL substrate).

Operational numbers

  • Telemetry cadence: aggregate index-usage counts flushed from VTGate into the Insights pipeline every 15 seconds (same as all other per-pattern aggregate axes — coverage, rows, exec-time, sketch).
  • Coverage: 100% of SELECT queries (no sampling, "negligible overhead" qualitative).
  • Scope: SELECT only; UPDATE / DELETE queries do not contribute to index-usage telemetry.
  • Availability gate: all PlanetScale databases; no Enterprise tier, no opt-in.
  • Hook point: InnoDB's index_init() is called once per (query, index) pair prior to use — the hook fires once per index touched, not once per row.
  • Wire-protocol channel: index-usage set rides along in the final packet returned by MySQL to the client (and then to VTGate).
  • Storage-engine scope: InnoDB only — "Since PlanetScale databases exclusively use the InnoDB storage engine, we were able to focus our efforts there." MyISAM, MEMORY, CSV, etc. would require separate handler patches.

Caveats

  • Short post (~1,100 words), product-announcement framing — architecture disclosure is compressed into two paragraphs at the bottom. The actual code path ("record the index name passed to this function in a per query data structure" + "return the list of used indexes in the final packet") is sketched at one altitude without disclosing data structures, packet format, or the server-side patch surface. How the patch relates to the community-facing Vitess/MySQL codebases isn't discussed — it presumably lives in PlanetScale's proprietary MySQL fork, not upstream MySQL or Vitess.
  • SELECT-only gap is real operational danger. The "before removing an index, check DML" caveat sits one line from the end. An operator who sees indexed:false showing zero hits on a secondary index and drops it without independently running EXPLAIN on the UPDATE / DELETE paths may have just broken a production write path. The post doesn't canonicalise the DML-verification workflow — it flags the gap and moves on. Operationalising "check DML before drop" across a real codebase typically means auditing ORM query logs + grepping the migration history + running EXPLAIN on every write path that touches the table.
  • Cross-shard aggregation elided. The post says "VTGate aggregates" without discussing how a single logical query pattern that fans out across 8 shards is aggregated. Presumably each shard's MySQL returns its own index-usage set; VTGate merges per-pattern per-15s; the resulting time-series is "percentage of executions that used index X" summed across shards. Not stated.
  • EXPLAIN-vs-Insights tension understated. The post frames EXPLAIN as "great for troubleshooting a problematic query or testing out a new index" — which understates EXPLAIN's canonical role as the verification tool before dropping an index. The DML-verification workflow the post calls out at the bottom ultimately reduces to "run EXPLAIN on the representative DML queries". EXPLAIN remains load-bearing in the operator's toolkit; Insights complements rather than replaces it.
  • No quantified EXPLAIN / performance_schema head-to-head. The limits of the two predecessor tools are named qualitatively. No numbers on how many EXPLAIN runs a typical operator does in a week vs how many queries Insights captures per minute; no quantified "how much better" the time-series surface is vs the static snapshot surface. The quantified gap ("thousands to millions of queries per second" coverage from a prior post, if we read it forward) is implicit, not stated.
  • No production-story incident trace. Unlike the 2022-09-27 error-debugging-with-Insights post (which walks a real database_branch_password AlreadyExists incident) or the 2023-04-20 fingerprinting post (which walks the 100M-row DELETE ... LIMIT 500 latency story), this post has no production case study. The screenshots are anonymised / stylised; no real incident narrative.
  • Rafer Hazen byline — Hazen is PlanetScale's canonical Insights-architecture voice. Prior Insights ingests: 2023-04-20 fingerprinting (the substrate this post builds on), 2023-08-10 storing-time-series (the pipeline this post's aggregates flow through), 2023-11-28 anomalies, 2026-03-24 enhanced-tagging, 2026-04 AI-index-suggestions. This 2024-08-14 post sits chronologically between the 2023 pipeline build-out and the 2026-era tagging / AI surfaces — the substrate for the AI-index-suggestions feature's "rows-read vs rows-returned" ratio input oracle depends on the per-index-usage telemetry canonicalised here.
  • Re-fetch gap: original publication 2024-08-14; RSS re-fetch 2026-04-21; 20-month gap. Insights has since evolved with tagging + AI index suggestions + dual-stream Kafka→ClickHouse architecture. The Indexes tab as described presumably still works, but the pipeline substrate beneath it has migrated (sharded MySQL → ClickHouse per the 2026-03-24 post). The handler-hook + wire-protocol sidecar mechanism is substrate-invariant — it captures data into whatever pipeline Insights uses downstream.
