CONCEPT Cited by 1 source
Schema dependency graph¶
Definition¶
A schema dependency graph is the directed graph whose nodes are schema entities (tables, views, columns, indexes, foreign-key constraints, check constraints, triggers) and whose edges encode the "X references Y" relationships between them. A view referencing columns of a table points to those columns and indirectly to the table. A foreign-key constraint points to the parent table's primary key. An index points to its columns.
The graph is the substrate
schemadiff walks to
validate schema correctness (no dangling references,
no cyclic views) and to partition diff statements into
equivalence
classes for the
near-atomic
multi-change deployment model.
Validation obligations¶
schemadiff uses the graph to enforce structural
correctness before any DDL executes:
- Reference completeness: every column / table / index referenced by a view or FK must exist in the target schema.
- Acyclic views: no view
v1may read fromv2ifv2reads fromv1(directly or transitively). - Type compatibility: FK columns must have types compatible with their parent-table primary-key columns.
- Intermediate validity: during a multi-step DDL sequence, every intermediate schema state must also be valid (a rename step cannot leave the graph dangling-referenced at any point).
Canonical verbatim: "When schemadiff reads a schema, it
maps and validates any dependency between entities. For
example, it can validate that table and columns referenced
by some view exist or that there are no cyclic view
definitions (v1 reads from v2, which reads from
v1)." (Source:
sources/2026-04-21-planetscale-deploying-multiple-schema-changes-at-once)
Dependency edges in the canonical Vitess domain¶
schemadiff handles (at minimum) these dependency types:
- Column → Table: every column belongs to exactly one table.
- Index → Column(s): an index references one or more columns of its table.
- View → Columns (of underlying table(s)): views are
query-defined; every column or expression in the view's
SELECTresolves to columns of base tables. - View → View: nested views reference other views.
- Foreign key → Parent table: an FK constraint references a parent table's primary key.
- Check constraint → Columns: a check constraint references columns of its table.
- Charset / collation hierarchy: column charset inherits from table, which inherits from database, which inherits from server — see character set and collation for the four-level inheritance hierarchy.
Diff → sub-graph → equivalence class¶
A schema diff (e.g. ALTER TABLE t DROP COLUMN info)
mutates specific nodes in the graph. schemadiff
overlays the diffs onto the graph and identifies which
diffs touch entities with dependency relationships. Two
diffs whose entities share a dependency chain end up in
the same
equivalence
class; cross-class diffs are independent.
Worked example from the post:
ALTER TABLE t ADD COLUMN info VARCHAR(128)— mutates thettable node, adding aninfocolumn sub-node.ALTER VIEW v AS SELECT id, info FROM t— mutates thevview node; new definition references bothidandinfoont.
Edges: v → info (new); v → id (already existed). The
view diff depends on the table diff. Both diffs are in the
same equivalence class; valid order is table-first, view-
second.
Generalisation to nested views¶
"The scenarios may be more complex when multiple, nested views are involved, which are based on yet multiple tables being changed in the deployment request." The graph framework naturally generalises:
- Every entity reachable via dependency chains from a diff's target node is in the same equivalence class.
schemadiffcomputes transitive reachability to determine class membership.- Within each class, topological sort respecting the graph edges gives a valid execution permutation.
The algorithm does not fail on nested views per se — it fails on graphs where no valid permutation exists (cyclic constraints, mutually-referencing FK adds) or on deploy- request branches whose complexity exceeds a "reliably safe path" threshold.
Relationship to general DAG / DBMS theory¶
The schema dependency graph is a specialisation of the general directed acyclic graph (DAG) widely used in database-internals contexts:
- Query-planner join graphs — different edges (JOIN predicates), same algorithms.
- Module / package dependency graphs — software build
systems (Bazel,
cargo) solve the same topological- sort-under-validity-constraint problem at a different altitude. - Foreign-key constraint graphs — the DBMS itself uses these for cascade-delete ordering, integrity checks, and referential-integrity enforcement during DML.
What makes the schema dependency graph distinct in the
schemadiff context is the
diff-overlay — classical DAGs assume node set is
static; schemadiff's problem is "how does this set of
mutations partition the graph into independent sub-
problems?"
Seen in¶
- sources/2026-04-21-planetscale-deploying-multiple-schema-changes-at-once — canonical wiki framing. Shlomi Noach introduces the graph implicitly through the dependency-validation language ("maps and validates any dependency between entities," "cyclic view definitions") and explicitly through the diff-to-equivalence-class partition algorithm. The graph is the substrate that makes near-atomic deployment mechanically possible — without it, PlanetScale could not safely compute the ordering of multi-table migrations without hand-authored dependency metadata.