Vitess schemadiff¶
What it is¶
schemadiff is a Vitess library
(Go package in the Vitess source tree) that reads schemas,
validates dependency constraints, computes diff DDL between
two schemas, and partitions the resulting diffs into
equivalence classes with a valid in-order execution
permutation inside each class. It is the analytical
substrate underneath PlanetScale's
near-atomic multi-change schema deployment model.
Introduced on the Vitess blog in April 2023
(vitess.io/blog/2023-04-24-schemadiff),
schemadiff is continuously extended — the
Vitess
21 release notes call out that more Online DDL analysis
("scenarios beyond the documented limitations," charset
conversion, INSTANT eligibility) is now "delegated to the
schemadiff library for programmatic power + testability."
Responsibilities¶
Schema validation¶
When schemadiff reads a schema, it maps and validates
any dependency between entities — for example, verifying
that tables and columns referenced by a view actually exist,
and that there are no cyclic view definitions (v1 reads
from v2, which reads from v1). This produces a
schema dependency graph
— nodes are schema entities (tables, views, columns,
indexes, constraints), edges are reference relationships.
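The validation pass described above can be sketched in a few lines. This is a minimal illustration in Python, not schemadiff's Go API: the function name `find_schema_errors` and the input shapes (a set of table names, a mapping from each view to the entities it reads) are assumptions made for the example, and only view-level dependencies are modelled.

```python
from typing import Dict, List, Set


def find_schema_errors(tables: Set[str], views: Dict[str, Set[str]]) -> List[str]:
    """Return validation errors for a toy schema.

    `tables` holds table names; `views` maps each view name to the set of
    entities (tables or views) it reads from. Both shapes are hypothetical.
    """
    errors: List[str] = []
    known = tables | set(views)

    # Missing-reference check: every edge must point at an existing entity.
    for view in sorted(views):
        for ref in sorted(views[view]):
            if ref not in known:
                errors.append(f"view {view} references missing entity {ref}")

    # Cycle check: depth-first search with three colours.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {name: WHITE for name in views}

    def visit(view: str) -> bool:
        colour[view] = GREY
        for ref in sorted(views[view]):
            if ref not in views:
                continue  # tables (and missing names) are leaves
            if colour[ref] == GREY:
                return True  # back edge: the view reads from itself indirectly
            if colour[ref] == WHITE and visit(ref):
                return True
        colour[view] = BLACK
        return False

    for view in sorted(views):
        if colour[view] == WHITE and visit(view):
            errors.append(f"cyclic view definition involving {view}")
            break
    return errors
```

Calling `find_schema_errors({"t"}, {"v1": {"v2"}, "v2": {"v1"}})` reports the v1/v2 cycle from the example above, while a view that reads only from an existing table yields an empty list.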
Diff computation¶
Given two schemas (typically "current production state"
and "desired state from the deploy-request branch"),
schemadiff emits the sequence of DDL statements
(CREATE TABLE, ALTER TABLE, ALTER VIEW, DROP TABLE,
etc.) needed to transform the first into the second.
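To make the input and output shape concrete, here is a deliberately small sketch of a table-only diff in Python. It is not how schemadiff parses or compares schemas: the `Schema` type (table name to column-name/column-type mapping) and the function `diff_tables` are hypothetical, and views, indexes and constraints are ignored.

```python
from typing import Dict, List

Schema = Dict[str, Dict[str, str]]  # table name -> {column name: column type}


def diff_tables(current: Schema, desired: Schema) -> List[str]:
    """Emit DDL strings that turn `current` into `desired` (tables only)."""
    statements: List[str] = []
    # Tables that exist only in the desired schema are created.
    for table in sorted(desired.keys() - current.keys()):
        cols = ", ".join(f"{c} {t}" for c, t in desired[table].items())
        statements.append(f"CREATE TABLE {table} ({cols})")
    # Tables that exist only in the current schema are dropped.
    for table in sorted(current.keys() - desired.keys()):
        statements.append(f"DROP TABLE {table}")
    # Tables present in both are altered column by column.
    for table in sorted(current.keys() & desired.keys()):
        old, new = current[table], desired[table]
        clauses = [f"DROP COLUMN {c}" for c in old if c not in new]
        clauses += [f"ADD COLUMN {c} {t}" for c, t in new.items() if c not in old]
        clauses += [
            f"MODIFY COLUMN {c} {t}"
            for c, t in new.items()
            if c in old and old[c] != t
        ]
        if clauses:
            statements.append(f"ALTER TABLE {table} " + ", ".join(clauses))
    return statements
```

The point of the sketch is only that the diff is a list of statements derived from two declarative states; the order in which this toy function emits them carries no safety guarantee.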
Dependency analysis on the diff¶
After generating the diffs, schemadiff analyses the
dependencies between the diff statements. From the
canonical source:
"If any two diff statements affect entities with a dependency relationship in the schema(s), then schemadiff knows it needs to resolve the ordering of those two diffs. If yet another diff affects entities used by either of these two, then schemadiff needs to resolve the ordering of all three." (Source: sources/2026-04-21-planetscale-deploying-multiple-schema-changes-at-once)
Equivalence-class partitioning¶
Diffs are partitioned into equivalence classes: the connected components of the diff-dependency graph. "All the diffs are thus divided into equivalence classes: distinct sets where nothing is shared between any two sets and where the total union of all sets is the total set of diffs." Cross-class ordering is arbitrary; within-class ordering is determined by topological sort with validity verification.
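Grouping diffs into connected components is a standard union-find exercise. The sketch below is in Python and assumes a hypothetical input shape: a mapping from a diff label to the set of entities that diff affects or uses. It illustrates the partitioning idea only and is not taken from the schemadiff source.

```python
from typing import Dict, List, Set


def equivalence_classes(diff_entities: Dict[str, Set[str]]) -> List[List[str]]:
    """Group diffs that share an entity, directly or transitively.

    `diff_entities` maps a diff label to the entities it affects or uses
    (a hypothetical input shape).
    """
    parent = {d: d for d in diff_entities}

    def find(d: str) -> str:
        while parent[d] != d:
            parent[d] = parent[parent[d]]  # path halving
            d = parent[d]
        return d

    owner: Dict[str, str] = {}  # entity -> first diff seen touching it
    for diff in sorted(diff_entities):
        for entity in diff_entities[diff]:
            if entity in owner:
                # Two diffs touch the same entity: merge their classes.
                parent[find(diff)] = find(owner[entity])
            else:
                owner[entity] = diff

    groups: Dict[str, List[str]] = {}
    for diff in sorted(diff_entities):
        groups.setdefault(find(diff), []).append(diff)
    return sorted(groups.values())
```

With three diffs chained through shared entities (`t`, then `v`) and one unrelated diff, the function returns two classes: the chain of three, and the singleton. The classes are disjoint and their union is the full set of diffs, matching the quoted definition.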
Permutation search with in-memory validity check¶
Within each equivalence class, schemadiff searches for a
permutation of the diffs that preserves schema validity at
every step of the sequence:
"For each equivalence class, schemadiff finds a permutation of the diffs such that if executed in order, the validity of the entire schema is preserved. It's worth reiterating that changes to the underlying database can only be applied sequentially. Thus, we must validate that the schema remains valid throughout the in-order execution. schemadiff achieves this by running in-memory schema migration and validation at every step."
The in-memory validity check means the library does not execute DDL against a real database during planning — it maintains an in-memory representation of the schema and mutates it step-by-step, catching invalid intermediate states (dangling view references, missing foreign-key targets, type-incompatible FK mismatches) before any production DDL is issued.
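The idea of validating every intermediate state in memory can be shown with a brute-force sketch in Python. Everything here is an assumption made for illustration: the toy schema shape, the helpers `is_valid` and `find_valid_order`, and the exhaustive `itertools.permutations` search, which is a stand-in technique and not a claim about how schemadiff searches. The example also assumes the view `v` initially selects both `id` and `info` from `t`.

```python
import copy
from itertools import permutations
from typing import Callable, Dict, List, Optional, Tuple

# Toy schema: {"tables": {name: set of columns},
#              "views": {name: (table, set of selected columns)}}
Schema = Dict[str, dict]
Diff = Tuple[str, Callable[[Schema], None]]


def is_valid(schema: Schema) -> bool:
    """A view is valid when its table exists and has every selected column."""
    for table, columns in schema["views"].values():
        if table not in schema["tables"] or not columns <= schema["tables"][table]:
            return False
    return True


def find_valid_order(schema: Schema, diffs: List[Diff]) -> Optional[List[str]]:
    """Return the first ordering that keeps the schema valid after every step."""
    for order in permutations(diffs):
        working = copy.deepcopy(schema)  # never touch the caller's schema
        for _label, apply in order:
            apply(working)
            if not is_valid(working):
                break  # invalid intermediate state: try the next ordering
        else:
            return [label for label, _ in order]
    return None


def drop_info(schema: Schema) -> None:
    schema["tables"]["t"].discard("info")


def alter_view(schema: Schema) -> None:
    schema["views"]["v"] = ("t", {"id"})
```

Given the two diffs `ALTER TABLE t DROP COLUMN info` and `ALTER VIEW v AS SELECT id FROM t`, dropping the column first leaves `v` referencing a missing column, so that ordering is rejected and the view change is placed first. No database is involved at any point; only the copied in-memory structure is mutated.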
Canonical use cases¶
PlanetScale deploy-request orchestration¶
A PlanetScale deploy-request consists of N schema changes
staged on a branch. When the deploy-request is submitted,
schemadiff:
- Computes the diff DDL from production → branch state.
- Partitions the diff into equivalence classes.
- For each class, computes a valid execution permutation.
- Hands the blueprint to the deploy controller, which runs long-running changes via VReplication staged in catch-up, and serialises immediate changes in the computed order at cut-over time.
This gives the
near-atomic multi-change deployment property: N
migrations complete "a few seconds apart" in the order
schemadiff computed, rather than hours apart in
operator-authored order.
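The hand-off to the deploy controller can be pictured as turning the ordered diffs into a short action list. The Python sketch below is an assumption-laden illustration: `cutover_plan`, the `(name, is_long_running)` tuples and the action strings are all hypothetical, and the real controller's behaviour is described only at the level of detail given above.

```python
from typing import List, Tuple


def cutover_plan(ordered_diffs: List[Tuple[str, bool]]) -> List[str]:
    """Turn an ordered list of (name, is_long_running) into controller actions.

    Long-running changes are started up front and held in catch-up; at
    cut-over every change is finished in the precomputed order.
    """
    actions = [f"begin {name}" for name, long_running in ordered_diffs if long_running]
    if actions:
        actions.append("wait for staged changes to catch up")
    for name, long_running in ordered_diffs:
        # Immediate changes are applied at cut-over; staged ones are completed.
        actions.append(f"complete {name}" if long_running else f"apply {name}")
    return actions
```

For an ordering of an immediate view change `v` followed by a long-running table change `t`, the plan is: begin `t`, wait, apply `v`, complete `t`. The long-running work starts early, but the visible changes land in the computed order within a short window.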
Vitess Online DDL analysis¶
The Vitess 21 release notes list multiple schemadiff
extensions inside the single-table Online DDL path:
- ALGORITHM=INSTANT eligibility analysis beyond MySQL's documented limitations: schemadiff models more scenarios in-memory to determine whether a change can use the cheap metadata-only path.
- Charset-change handling: schemadiff analyses when programmatic text conversion can replace MySQL's built-in CONVERT(... USING utf8mb4) for performance in primary-key / iteration-key columns (utf8mb4 vs utf8).
Design properties¶
- Pure-Go library, no runtime state. schemadiff is an analytical library, not a service: it takes two schemas as strings and returns an analysis tree. No database connections, no persistent state, no coordination with Vitess control-plane components.
- In-memory simulation of DDL. The validity checker mutates an in-memory schema representation and verifies each intermediate state; no real DDL is issued during planning.
- Complete dependency coverage. Table ↔ view ↔ column dependencies, foreign-key target verification, index structure validity, and (per Vitess 21) charset / collation / INSTANT eligibility are all modelled.
- Complementary to the Online DDL executor. schemadiff produces the plan; the Vitess Online DDL executor (via VReplication or pt-online-schema-change / gh-ost strategies) executes the plan. Each layer is independently testable.
Limitations and caveats¶
- Resource-bounded. "Resources are not infinite, and only so many changes can run concurrently. Altering a hundred tables in one deployment request is not feasible and possibly not the best utilization of database branching. It is possible to go too far with a branch so that the changes are logically impossible to deploy (or rather, so complex that it is not possible to determine a reliably safe path)." (Source: sources/2026-04-21-planetscale-deploying-multiple-schema-changes-at-once)
- Permutation search complexity not disclosed. The in-memory-validation-at-every-step algorithm's asymptotic complexity, its behaviour on pathological graphs (cycles the user intended but schemadiff rejects), and its termination guarantees for very large equivalence classes are not documented in the canonical post.
- "Reliably safe path" is a judgement call. The post acknowledges some deploy-request branches can be mechanically un-deployable by construction. No well-defined rule is given for the boundary; schemadiff returns failure and the operator must decompose the branch into smaller deploy-requests.
- Relies on MySQL-only DDL semantics. Schema dependency analysis is MySQL-flavoured (view semantics, FK semantics, charset / collation hierarchy). Generalisation to Postgres under PlanetScale Postgres / Neki is not disclosed as of 2026-04-21.
Seen in¶
- sources/2026-04-21-planetscale-deploying-multiple-schema-changes-at-once: canonical first wiki disclosure of the library's role in multi-change deployment. Shlomi Noach frames schemadiff as the analytical substrate underneath PlanetScale's near-atomic deployment model: it partitions diffs into equivalence classes ("distinct sets where nothing is shared between any two sets and where the total union of all sets is the total set of diffs"), computes a valid permutation inside each class via in-memory schema migration and validation at every step, and hands the blueprint to the deploy controller. Canonical four-panel diagram of "given a set of diffs → group into equivalence classes → arbitrary ordering across classes → valid ordering within each class." Canonical view-drop vs view-add worked example: ALTER TABLE t DROP COLUMN info combined with ALTER VIEW v AS SELECT id FROM t requires begin-t-wait-immediate-v-complete-t sequencing, not the naive "do v first" intuition.
- sources/2026-04-21-planetscale-announcing-vitess-21: extension of schemadiff's responsibilities in the single-table Online DDL path. Vitess 21 release notes list "more INSTANT DDL scenario analysis beyond the documented limitations" and "charset-change handling [that] now uses programmatic text conversion rather than MySQL's CONVERT(... USING utf8mb4) for performance in primary-key / iteration-key columns", both delegated to schemadiff "for programmatic power + testability." The April-2023 library continues to be a load-bearing extensibility point for Vitess Online DDL three years later.
Related¶
- systems/vitess
- systems/mysql
- systems/planetscale
- systems/vitess-vreplication
- concepts/schema-diff-equivalence-class
- concepts/schema-dependency-graph
- concepts/online-ddl
- concepts/near-atomic-schema-deployment
- concepts/staged-then-sealed-migration
- patterns/topological-order-by-equivalence-class
- patterns/near-atomic-multi-change-deployment
- companies/planetscale