Skip to content

SYSTEM Cited by 2 sources

PlanetScale Schema Advisor

What it is

Schema Advisor is the internal PlanetScale system that powers the schema recommendations feature surfaced inside PlanetScale Insights. It analyses production MySQL databases and emits actionable, DDL-ready recommendations that can be applied directly through the deploy-request workflow. Introduced publicly on 2024-02-28 by Taylor Barnett and Rafer Hazen (sources/2026-04-21-planetscale-introducing-schema-recommendations):

We've built a system that we internally refer to as the "Schema Advisor." It can make schema recommendations and understand when a schema change closes an existing open recommendation.

Architecture

Trigger: Kafka schema-change events

Schema Advisor is event-driven, not polling:

Each time a production branch's schema changes within PlanetScale, an event is emitted to Kafka. This triggers a background job to examine the schema for potential recommendations.

Canonicalised as patterns/event-triggered-schema-analysis. Every applied deploy request emits a schema-change event; the Schema Advisor consumer re-runs rule-based checks against the new schema state. This makes the advisor idempotent by schema-state (replaying events is harmless), decoupled from the schema-change apparatus (the advisor never blocks a deploy), and eventually consistent with the live schema.

Inputs

  1. Current schema — walked directly for structural recommendations (redundant indexes, primary-key exhaustion).
  2. Insights query performance telemetry — per-query-pattern statistics (rows read, rows returned, runtime share, frequency). Used to select candidate queries for missing-index suggestions and to detect unused tables.
  3. Vitess query parser + semantic analysis utilities — used offline to extract indexable columns from Insights-identified slow-query candidates. Second wiki-canonical use of the Vitess parser beyond VTGate planning.
  4. Patched MySQL fork's custom ANALYZE TABLE ... UPDATE HISTOGRAM variant — exposes per-column cardinalities without writing to the production optimizer's statistics table (so composite-index column order can be chosen without destabilising live query plans). See patterns/cardinality-extraction-for-composite-index-ordering.

Outputs

Each recommendation carries:

  • A natural-language explanation + benefits ("reduced memory and storage, decreased execution time, prevent ID exhaustion").
  • The affected schema element or query pattern.
  • The exact DDL that would apply the recommendation.
  • A "Create and apply" button that creates a PlanetScale branch with the DDL applied, so it can be tested before merging into production via deploy request.

The Schema Advisor also closes open recommendations when it observes a deployed schema change that resolves them, giving the surface a proper lifecycle (open → in-progress → closed) rather than a static firehose.

The four v1 recommendation classes (2024)

1. Add missing indexes

Daily scan of Insights query performance data:

Insights scans your query performance data daily to identify queries over the past 24 hours for frequently issued queries with a high aggregate ratio of rows read compared to rows returned. It will then parse the query to extract indexable columns, estimate each column's cardinality (number of unique values) to determine optimal column order and suggest a suitable index.

Same workload-pre-filter shape (rows-read / rows-returned ratio + frequency) as index-candidate filtering — the 2026 AI-successor formalises the threshold at ≥0.1% aggregate runtime share, but the structural filter is already in place in 2024.

2. Remove redundant indexes

Schema-only (no query telemetry required):

Exact duplicate indexes - an index that has the same columns in the same order.

Left prefix duplicate indexes - an index that has the same columns in the same order as the prefix of another index.

Canonical operational datum verbatim: "Redundant indexes are remarkably common. Our initial set of recommendations found that 33% of PlanetScale databases have redundant indexes that they may benefit from removing." Canonicalised as concepts/redundant-index-detection.

3. Prevent primary-key ID exhaustion

Daily scan of AUTO INCREMENT primary-key column values:

Insights scans all of the AUTO INCREMENT primary keys in your database schema and checks the current AUTO INCREMENT value daily to identify where you might be approaching primary key ID exhaustion. If Insights detects that one of the columns is above 60% of the maximum allowable type, it will recommend changing the underlying column to a larger type.

Also propagates to foreign keys: "Insights scans queries to parse joins and correlated subqueries to find foreign keys and suggests increasing the column size for those columns." Canonicalised as concepts/primary-key-id-exhaustion.

4. Drop unused tables

Daily scan combining schema age + query telemetry:

Insights scans your query performance data daily to identify if any tables are more than four weeks old and haven't been queried in the last four weeks.

Two-sided temporal freshness — excludes new tables AND requires recent-query-absence. Canonicalised as concepts/unused-table-detection.

Why the patched MySQL ANALYZE matters

Choosing column order for a composite index requires knowing per-column cardinality. The upstream ANALYZE TABLE ... UPDATE HISTOGRAM command writes updated statistics to the optimizer's stats table — which would risk destabilising customer query plans as a side-effect of recommendation generation. The patched variant extracts histograms without writing, keeping advisory machinery isolated from live optimizer state. Structural sibling of HypoPG-style hypothetical indexes in the 2026 AI-successor: both keep advisor machinery from perturbing production optimizer behaviour.

Canonicalised as patterns/cardinality-extraction-for-composite-index-ordering.

The 2024 → 2026 evolution

The 2026 AI-powered Postgres index suggestions feature (sources/2026-04-21-planetscale-ai-powered-postgres-index-suggestions) is the successor — same Schema Advisor substrate, same Insights UX host, same workload-pre-filter shape, but:

Dimension 2024 2026
Target engine MySQL Postgres
Candidate generator Rule-based logic + Vitess parser LLM
Validation oracle Patched ANALYZE TABLE ... UPDATE HISTOGRAM HypoPG + EXPLAIN
Recommendation classes 4 (index / redundant / PK-exhaustion / unused-table) 1 (index)
Workload pre-filter Qualitative ("frequently issued") Quantitative (≥0.1% runtime share + rows-read ratio + min frequency)
Validation pattern ad-hoc per class patterns/llm-plus-planner-validation

The structural shape is unchanged: workload-pre-filter at input, advisor-machinery-isolated-from-production-optimizer- state at output, deploy-request lifecycle for apply path.

Seen in

  • sources/2026-04-21-planetscale-introducing-schema-recommendationslaunch disclosure (Taylor Barnett + Rafer Hazen, 2024-02-28). Canonicalises all four v1 recommendation classes, the Kafka-event-driven trigger architecture, the patched-MySQL-ANALYZE cardinality-extraction mechanism, the Vitess-parser use for indexable-column extraction, the lifecycle (open → closed on deployed schema change), and the 33% redundant-index operational datum.
  • sources/2026-04-21-planetscale-ai-powered-postgres-index-suggestions2026 AI successor (Rafer Hazen). Replaces rule-based candidate generation with LLM; replaces patched ANALYZE TABLE with HypoPG EXPLAIN-cost validation; adds quantitative workload- share threshold (≥0.1%). Same Schema Advisor substrate, Postgres engine, index-suggestions scope.

Source

Last updated · 378 distilled / 1,213 read