PlanetScale — The challenges of supporting foreign key constraints¶
Summary¶
Shlomi Noach and Manan Gupta (Vitess maintainers at PlanetScale, 2023-12-05)
retrospective on the roughly-one-year engineering investment to ship
foreign-key-constraint support in PlanetScale / Vitess
while preserving Vitess's non-blocking
Online DDL,
gated deployments, and
online imports properties.
PlanetScale had originally declined to support foreign keys because the
canonical shadow-table
online-DDL mechanism is architecturally incompatible with
InnoDB's native FK implementation. The post walks layer
by layer through what broke and how it was fixed: branching + deploy-request
semantics (schema path computation with FK-ordering constraints and
revert-as-orphan-row-warning), query serving (Vitess now
owns cascade logic instead of delegating to InnoDB, because InnoDB's
cascades never reach the binlog and therefore aren't replayable by any
CDC/replication consumer that isn't InnoDB itself), two MySQL
server-level patches in PlanetScale's MySQL fork
(rename_table_preserve_foreign_key + an "internal operations tables"
FK-exemption primitive) that reconcile shadow-table swaps with FK
references, and imports (single-bulk snapshot + binlog tail, instead of
the alternating snapshot+tail VReplication does everywhere else, because
with cascading FKs you can't selectively skip-ahead-to-copied-rows
anymore). The post closes with a rigorous-testing section canonicalising
an application-oriented e2e stress-test suite that strips FKs on the
replica to differentially detect any remaining InnoDB-hidden cascades,
and a fuzzer harness that generated four illustrative bugs named
verbatim in the post. Foreign keys are currently single-shard-only
— cross-shard and even shard-scoped-in-multi-shard support is roadmap.
Key takeaways¶
- InnoDB cascades are invisible to the binlog — load-bearing
architectural constraint. When a
DELETE/UPDATEon a parent table firesON DELETE CASCADE/ON UPDATE CASCADE/ON DELETE SET NULL/ON UPDATE SET NULLon child rows, InnoDB applies the child-row changes internally, and "these changes to the child are done internally in InnoDB, and are never logged to the binary log". Any CDC tool, replica, or snapshot-plus-catchup-replication workflow reading the binlog is missing data for cascaded child changes. This single property cascades (sorry) into Online DDL incompatibility, Database Imports incompatibility, and the architectural decision to own FK logic in Vitess rather than patch InnoDB (see concepts/innodb-silent-cascade-in-binlog). - Vitess now owns foreign-key logic at VTGate — rather than
delegate to MySQL's native enforcement. For
DELETE FROM parent WHERE id=7with anON DELETE CASCADEchild, VTGate issues the recursive orchestration:SELECT col FROM parent WHERE id=7 FOR UPDATEto lock parent rows;DELETE FROM child WHERE col IN (parent_column_values)first (so the child deletes are in the binlog); thenDELETE FROM parent WHERE id=7. By the time InnoDB processes the parent delete, there are no child rows to cascade. All in one transaction. Canonicalised as patterns/application-level-cascade-orchestration. Trade-off vs the patch-InnoDB alternative: more locking, more back-and-forth between VTGate and MySQL — there is a "non-zero performance impact to using foreign key constraints with Vitess/PlanetScale". ON UPDATE CASCADErequiresFOREIGN_KEY_CHECKS=0— because Vitess must update the child before the parent (to capture cascaded child updates in the binlog), but the pre-update child row doesn't match any parent row with the new key value. So Vitess turns off MySQL's FK checking for the update and becomes responsible for validating any otherON UPDATE RESTRICTconstraints the grandchild may have. Example from the post:UPDATE parent SET id=8 WHERE id=7with a cascading child and aRESTRICTgrandchild. Running withFOREIGN_KEY_CHECKS=0means MySQL skips all FK validation for the statement — even grandchild RESTRICTs that must still hold (see concepts/vitess-foreign-key-enforcement).- Two MySQL server-level patches ship in PlanetScale's MySQL
fork. (a)
rename_table_preserve_foreign_key=1— makesRENAME TABLE parent TO parent_newpreserve the child's FK pinning to the nameparentrather than following the table to its new name. Without this, shadow-table swap would leave every child's FK pointing atparent_old(see concepts/innodb-parent-table-rename-pinning). (b) "Internal operations tables" — a new MySQL-fork primitive that marks specific tables as FK-exempt; InnoDB skips all FK validation + cascading for those tables, "as if weSET FOREIGN_KEY_CHECKS=0for specific tables, even while the transaction otherwise applies foreign key logic to other tables." Makes shadow tables invisible to the production parent/child FK graph during backfill + catch-up (see concepts/innodb-internal-operations-table). (Source: thebb777e3ecommit inplanetscale/mysql-serveris named inline — first canonical wiki citation of this fork.) - Database Imports switch from alternating snapshot+catch-up to single-bulk-snapshot-then-tail. External source MySQL instances aren't running the Vitess FK logic, so their binlogs still have the InnoDB cascade holes. To avoid applying cascades against rows that may not yet be on the target, VReplication falls back to one big snapshot, then tail the binlog (analogous to a MySQL point-in-time recovery: full restore + sequential binlog replay) rather than the normal per-table interleaved copy+tail pattern. All replayed events then operate on rows that already exist on the target, so the InnoDB cascade-on-the-target-replica does the right thing by itself. Imports can't compose with Online DDL for FK tables on this path.
- FK-aware schema
diff+ deploy-path computation. Supporting FK DDL syntax is easy; the hard part is that PlanetScale's deploy-request not only shows the diff, it computes a valid sequence of steps that transforms base into head while the schema is valid at every intermediate step. FKs impose ordering (create parent before child; index parent's referenced column before adding the FK on child) and block concurrency in some cases (migratingt2that referencest1can't start untilt1's migration completes ift1's new column is the one being referenced). All evaluated in-memory at deploy-request creation time viaschemadiff(Vitess Go library). - Reverts are allowed-with-orphan-row-warning, not forbidden.
If you
DROPachildtable,DELETE FROM parent, then revert the drop — the restoredchildtable's rows now have no matching parent rows. PlanetScale surfaces this as a "change may not be revertible" warning. The schema will be fine; orphaned rows are possible. Same applies to destroying or editing a constraint. - Testing is differential-on-the-replica and differential-
against-MySQL. The e2e test suite creates a four-table FK
hierarchy (parent / two children / one grandchild), uses MySQL
replication with FKs stripped off every replica via
ALTER TABLE ... DROP FOREIGN KEY, then runs a high-contention random-write app against Vitess. If Vitess lets InnoDB do any cascading on the primary, those cascades won't reach the replica — which has no FK knowledge — and primary/replica drift. Drift → test failure. Clean replication + matching row-counts on both = Vitess did all the cascade work explicitly. The fuzzer suite pits Vitess against a standalone MySQL on identical schemas and queries and flags any result divergence; canonicalises patterns/mysql-compatible-differential-fuzzing at a new altitude — this fuzzer targets DML + FK semantics, not planner output (the Arvind Murty 2023 summer-intern planner fuzzer). - Four illustrative fuzzer-found bugs. (a)
CASCADEgrandparent +CASCADEparent +RESTRICTgrandchild with a no-op update: Vitess was running a validationJOINthat rejected the update even when MySQL would allow it, because "the row isn't actually changing" — MySQL doesn't treat a null-update as a cascade trigger. Fix: addAND child.col NOT IN (...)to exclude unchanged rows from the validation. (b) Arithmetic-on-varchar-column producing-0of typeFLOAT: Vitess's pre-cascadeSELECT id, col, col * (col - (col)) FROM parentreturned-0, then issuedUPDATE child SET col = -0 WHERE col IN ('-5'). MySQL normalised-0and0as equal, dropping the cascade and leaving parent/child inconsistent. Fix: cast to the column's declared type (CAST(... AS CHAR)). (c)REPLACE INTOmissing gap-locks onSELECT-for-cascade: on a unique index, the cascadeSELECT col FROM t WHERE col IN (5) OR id IN (3) FOR UPDATEdidn't acquire gap locks, so a concurrent session could insert a matching row between theSELECTand the subsequent cascade action. Fix: promote to aNOWAITlock to fail-fast instead of blocking indefinitely; canonicalised as patterns/nowait-lock-for-cascade-select. (d) Cyclic FK references prohibited: between different tables, not allowed (self-reference is fine). See concepts/cyclic-foreign-key-prohibition. - Current limitations, verbatim. (a) Cyclic FKs between different tables prohibited; self-referencing tables OK. (b) FK constraint names change on every deployment because "constraint names must be unique to the schema" (ANSI SQL), and Online DDL creates a shadow table whose same-named FK constraint would collide. (c) FKs currently only supported in unsharded environments. Shard-scoped and cross-shard on the roadmap, motivated by the same "Vitess-owns-FK-logic" decision that already stands up locally. (d) Some revert scenarios create orphaned rows (see item 7).
Systems extracted¶
- systems/vitess — gains FK-aware planner + cascade orchestration.
- systems/mysql — the underlying engine whose FK semantics Vitess must reproduce + work around.
- systems/innodb — the storage engine doing all the actual FK work in vanilla MySQL, and whose binlog-cascades gap forces Vitess to own the logic.
- systems/vtgate — the VTGate planner grows FK-aware execution planning (locking, ordering, recursion).
- systems/vitess-vreplication — Database Imports flow forks snapshot+tail sequencing when FKs with cascading actions are present.
- systems/vitess-schemadiff — schema-path computation becomes FK-aware (ordering constraints, concurrency blocks, deploy eligibility).
- systems/planetscale-mysql-server-fork — new first-class wiki
page; PlanetScale's MySQL fork at
planetscale/mysql-server, canonically cited via thebb777e3epatch that addsrename_table_preserve_foreign_key.
Concepts extracted¶
- concepts/foreign-key-constraint — the SQL primitive; the wiki's
first dedicated home canonicalising MySQL-specific FK rules
(referenced columns must be indexed-in-order; type-matching caveats
including the
VARCHAR(32)vsVARCHAR(64)compatibility;FOREIGN_KEY_CHECKS=1as the enforcement switch). - concepts/innodb-silent-cascade-in-binlog — the load-bearing MySQL limitation behind the entire post.
- concepts/vitess-foreign-key-enforcement — Vitess's
application-level FK semantics on top of MySQL with
FOREIGN_KEY_CHECKS=0-on-cascades. - concepts/innodb-internal-operations-table — the PlanetScale MySQL-fork primitive exempting specific tables from FK validation.
- concepts/innodb-parent-table-rename-pinning — the
rename_table_preserve_foreign_keyserver variable semantics. - concepts/cyclic-foreign-key-prohibition — the self-reference-OK / cross-table-cycle-not-OK rule in Vitess FK support.
Patterns extracted¶
- patterns/application-level-cascade-orchestration — the
lock-select / delete-children-first / delete-parent pattern
Vitess uses for
ON DELETE CASCADE. - patterns/nowait-lock-for-cascade-select — the
SELECT ... FOR UPDATE NOWAITfix for the missing-gap-lock class of bug.
Operational numbers¶
- Around one year from initial deep-dive to ship ("around a year in fact").
- Cascade orchestration involves one extra
SELECT ... FOR UPDATEper level of FK depth — linear in tree depth, not breadth. - Fuzzer test schema: 20 tables with mixed FK relationships to cover the DML surface.
- E2E test schema: 4 tables (1 parent-only + 2 child-only + 1 parent-and-child); generates high-contention writes across all 4
- an Online DDL run on top of each.
- Current scope: single-shard only; multi-shard + cross-shard roadmap.
Caveats¶
- Post is mechanism-disclosure altitude, not benchmark altitude. No production throughput numbers disclosed for FK-on vs FK-off. The qualitative framing is "non-zero performance impact".
- Cyclic-FK prohibition was not deeply explained — the post states it as a current limitation without walking through why (likely: deadlock surface in the cascade-orchestration pattern, or unbounded recursion in the validation plan; both speculation on our part, not stated).
- Reverts with orphaned rows stated as a current platform choice (warn + allow) rather than a forced-error — a design trade-off vs forbidding destructive-and-non-revertible paths outright. Comparable framing with 2024's schema reverts post which canonicalises inverse-replication as the usual revert path; this FK post surfaces the schema-shape where inverse-replication alone can't restore the lost referential invariant.
- Not a cross-shard FK post. Every architectural choice in this post (Vitess owns the logic; InnoDB binlog-cascade-gap is the enemy; MySQL-fork patches bound the shadow-table surface) prepares for cross-shard FK support but doesn't implement it.
Source¶
- Original: https://planetscale.com/blog/challenges-of-supporting-foreign-key-constraints
- Raw markdown:
raw/planetscale/2026-04-21-the-challenges-of-supporting-foreign-key-constraints-ae432d34.md
Related¶
- systems/vitess
- systems/mysql
- systems/innodb
- systems/vtgate
- systems/vitess-vreplication
- systems/vitess-schemadiff
- systems/planetscale-mysql-server-fork
- systems/planetscale
- concepts/foreign-key-constraint
- concepts/innodb-silent-cascade-in-binlog
- concepts/vitess-foreign-key-enforcement
- concepts/innodb-internal-operations-table
- concepts/innodb-parent-table-rename-pinning
- concepts/cyclic-foreign-key-prohibition
- concepts/binlog-replication
- concepts/change-data-capture
- concepts/shadow-table
- concepts/cutover-freeze-point
- concepts/gap-locking
- concepts/online-ddl
- concepts/deploy-request
- concepts/schema-three-way-merge
- concepts/schema-change-deploy-order
- concepts/non-revertible-schema-change
- patterns/application-level-cascade-orchestration
- patterns/nowait-lock-for-cascade-select
- patterns/shadow-table-online-schema-change
- patterns/snapshot-plus-catchup-replication
- patterns/mysql-compatible-differential-fuzzing
- patterns/three-way-merge-for-schema-changes
- companies/planetscale