PlanetScale — Behind the scenes: How schema reverts work¶
Summary¶
Holly Guevara and Shlomi Noach (PlanetScale) describe how
PlanetScale turns a completed online schema change into a
reversible one: the user can click "Revert changes" after
deployment and get the old schema back instantly, without
losing any rows written during the window the new schema was
live. The trick is not a fancier rollback — it is an
architectural property of how Vitess runs online schema changes
in the first place. Most online-DDL tools (pt-online-schema-change,
gh-ost) tear down the shadow table and replication stream as
soon as the cut-over completes; PlanetScale keeps both alive.
At cut-over the old production table becomes the new shadow
table, and a VReplication stream
is immediately primed in the opposite direction (new → old)
so the old schema stays current with every post-cut-over write.
A revert is then the same atomic
cut-over swap as the original migration, just played backwards —
no data copy, no catch-up wait, no row loss. The mechanism rests on
two VReplication properties that PlanetScale calls out as unique
in the online-DDL world: transactionally accurate journalling
of migration state against MySQL GTID
positions, and non-termination after cut-over so the stream
is still there to be inverted.
Key takeaways¶
-
All online schema-change tools follow the same four-step shape — build a shadow table with the new schema, copy existing rows, track concurrent writes, cut over. Quoting directly: "In short, online schema change tools copy the production table without data, apply the schema change to the copy, sync the data, and swap the tables." Canonical new patterns/shadow-table-online-schema-change pattern. The shadow table is the first-class primitive of the family — originally empty, schema-evolved, and filled concurrently with live traffic until it catches up with the production table. (Source: sources/2026-04-21-planetscale-behind-the-scenes-how-schema-reverts-work)
-
VReplication's distinguishing properties make revert possible. The post names five VReplication design choices that separate it from tools like `pt-online-schema-change` and `gh-ost`: (a) it tracks both the initial-data backfill and the ongoing-change stream rather than backfill only; (b) it maps every transaction to a MySQL GTID so copy progress and change-log progress share one ordering; (c) it interleaves copy and change-log catch-up by GTID-set comparison rather than by wall-clock polling; (d) "it couples copy state and its progress transactionally. Likewise, it couples changelog events and their progress transactionally": sidecar state advances in the same commit as the destination write, so crash recovery resumes exactly where it left off; and crucially (e) "Unlike any other schema change solution, Vitess does not terminate upon migration completion": the stream and the now-former production table stay hot after cut-over and become the revert substrate.
-
The cut-over is the only write-locked moment; everything else is non-blocking. "The cut-over is the single step where a write lock is explicitly imposed on the table. Until the swap is complete, no writes can take place. It's the 'freeze point', where both tables are in perfect sync." The proxy layer still accepts client writes during the freeze (VTGate buffers them), so the application sees a small latency spike rather than errors. Canonical new concepts/cutover-freeze-point concept: the brief, mandatory, server-side write-freeze at exactly the moment the two tables must be declared equivalent. "The Vitess migration flow marks the database position at that freeze point. It then swaps the two tables: the shadow table replaces the original table, and the original table replaces the shadow table."
-
After cut-over, the old table becomes a pre-staged shadow for the reverse direction — "open-ended revert" runs in the background regardless of whether the user ever clicks Revert. "Shortly after migration completion, PlanetScale prepares an open-ended revert. The revert process tracks ongoing changes to the table and applies them to a shadow table. That should sound familiar. Indeed, we already have a shadow table in place. It is already populated with data, and we know that it was in full sync with what we now call the new table at cut-over time." Canonical new concepts/pre-staged-inverse-replication concept + patterns/instant-schema-revert-via-inverse-replication pattern. The old table has the old schema; VReplication filters incoming writes through the new → old projection and applies them to keep the old table current with every post-cut-over write.
-
Revert is the same cut-over sequence run backwards — no data copy. "So once you click revert, all we need to do is swap them again! It goes through the exact same cut-over process, and the shadow table becomes the production table again and vice versa." The revert is not a backup restore, not a point-in-time replay, not a copy operation — it is the atomic table-swap under a brief VTGate query-buffer, run on a shadow table that has been kept up-to-date by VReplication from the moment the migration completed. Because every post-cut-over write has already been replicated into the shadow, "With this process, you retained any new data that was added during that period, which would have been a huge hassle in traditional rollback and restore methods."
-
Revert preserves rows written under the new schema, and accepts the obvious corollary. Walked example: a `users` table has a `title` column; the deploy runs `ALTER TABLE users DROP COLUMN title`; new rows (like "Savannah") are written to the post-`ALTER` schema without a `title`. On revert the `title` column is restored for everyone, and Savannah exists but has no title: "this is because that entry was added after the tables were swapped, so the `title` column didn't exist in production. This is expected and something you can clean up after the revert, if necessary." The property being promised is strictly stronger than traditional rollback and restore methods (no row loss; the new rows survive), but weaker than time travel (values the live schema had no column for were never captured, so they are absent after the revert). A useful contrast with expand-migrate-contract: PlanetScale's revert is the "oh no" escape hatch for teams that shipped a destructive schema change; expand-migrate-contract is the deliberate way to avoid needing one.
-
The mechanism generalises beyond `ALTER`, but the blog teases this rather than documenting it. Noach/Guevara note the current post walks `ALTER`; `CREATE` and `DROP` reverts require "more nuances" left for a future post. Practically this means the VReplication inversion trick described here (swap tables, invert the stream) doesn't compose as cleanly when the forward migration was "this table didn't exist before; now it does" (where's the shadow?) or "this table existed before; now it doesn't" (where's the source?). The architectural principle is the same; the physical plumbing is different.
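The `DROP COLUMN title` walk-through above can be modelled in a few lines. This is a minimal sketch, not Vitess code: `Table`, `ToyDatabase`, the synchronous "stream" inside `write`, and the sample row for "Ada" are all hypothetical, and the rule that an update leaves unmapped columns untouched is an assumption made for the illustration.

```python
class Table:
    def __init__(self, columns):
        self.columns = columns
        self.rows = {}  # primary key -> {column: value}

    def apply(self, pk, values):
        # Project the write onto this table's schema: columns this table
        # lacks are dropped; columns the write lacks keep their previous
        # value, or None (standing in for NULL/default) if the row is new.
        current = self.rows.get(pk, {})
        self.rows[pk] = {c: values.get(c, current.get(c)) for c in self.columns}


class ToyDatabase:
    """Hypothetical stand-in for a production table, its shadow, and a live stream."""

    def __init__(self, production, shadow):
        self.production = production
        self.shadow = shadow

    def write(self, pk, values):
        self.production.apply(pk, values)
        # The stream is never torn down, so the shadow sees every write.
        self.shadow.apply(pk, values)

    def cut_over(self):
        # Swap roles; the stream direction flips with them because write()
        # always replicates production -> shadow.
        self.production, self.shadow = self.shadow, self.production


old = Table(columns=("id", "name", "title"))
new = Table(columns=("id", "name"))  # schema after DROP COLUMN title
db = ToyDatabase(production=old, shadow=new)

db.write(1, {"id": 1, "name": "Ada", "title": "Engineer"})  # illustrative row
db.cut_over()                                # deploy: new schema goes live
db.write(2, {"id": 2, "name": "Savannah"})   # written without a title
db.cut_over()                                # revert: same swap, reversed

reverted = db.production.rows
```

After the second swap, `reverted` holds both rows under the old schema: the pre-existing row keeps its title, and the row written during the new-schema window is present with `title` set to `None`.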
Extracted systems¶
- systems/planetscale — the managed-database product surfacing the "Revert changes" button this post describes. Feature enrolled via the database Settings page ("limited beta" at post time) and exercised from the deploy-requests UI.
- systems/vitess — the database layer under PlanetScale on which the migration-revert mechanism is implemented; the post's framing of VReplication uniqueness is implicitly a framing of Vitess's uniqueness among MySQL-compatible sharding substrates.
- VReplication — the substrate. "At PlanetScale, we leverage the power of Vitess' VReplication internals to run online schema changes." Core design choices (GTID-precision transaction mapping, transactional state journalling, non-termination after cut-over) are what make revert possible.
- systems/mysql: the engine the technique ultimately rests on. The post names two MySQL-specific primitives explicitly: `START TRANSACTION WITH CONSISTENT SNAPSHOT` for the snapshot half, GTID for the replication-position half.
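The "transactional state journalling" property can be illustrated with a small sketch: the applied change and the recorded stream position commit together or not at all. SQLite stands in for MySQL here, an integer stands in for a GTID position, and the `progress` table and `apply_event` helper are hypothetical names, not Vitess's sidecar schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shadow (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE progress (stream TEXT PRIMARY KEY, position INTEGER)")
conn.execute("INSERT INTO progress VALUES ('demo', 0)")
conn.commit()


def apply_event(conn, position, row, crash=False):
    """Apply one change event and advance the position in a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT OR REPLACE INTO shadow VALUES (?, ?)", row)
            if crash:
                raise RuntimeError("simulated crash before commit")
            conn.execute(
                "UPDATE progress SET position = ? WHERE stream = 'demo'",
                (position,),
            )
    except RuntimeError:
        pass  # the row write above was rolled back together with the position


def current_position(conn):
    return conn.execute("SELECT position FROM progress").fetchone()[0]


def shadow_ids(conn):
    return [r[0] for r in conn.execute("SELECT id FROM shadow ORDER BY id")]


apply_event(conn, 1, (1, "a"))
apply_event(conn, 2, (2, "b"), crash=True)   # neither the row nor the position lands
after_crash = (current_position(conn), shadow_ids(conn))

apply_event(conn, 2, (2, "b"))               # resume from the recorded position
after_retry = (current_position(conn), shadow_ids(conn))
```

Because the position never runs ahead of or behind the data, a restart can resume from the stored position without re-copying or skipping rows.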
Extracted concepts¶
- concepts/shadow-table (new canonical concept) — the first-class primitive of every online-DDL tool in the family. Empty-schema-evolved copy of production; filled by the tool concurrently with live writes; swapped with the original at cut-over. Worth a dedicated page because the term gets referenced across all online-DDL, ghosted-migration, and database-revert content on the wiki.
- concepts/cutover-freeze-point (new canonical concept) — the single write-locked moment in an online schema change, at which the source table's GTID position is recorded and the shadow is declared equivalent. Distinct from VTGate's client-facing query buffer — the freeze point is server-side (the MySQL write lock on the table); the query buffer is proxy-side (client connections see a latency spike, not errors).
- concepts/pre-staged-inverse-replication (new canonical concept) — the unique-to-Vitess architectural property that after cut-over the old table and the VReplication stream stay alive, the stream is immediately re-primed in the opposite direction, and the old table becomes a hot inverse shadow that never needs a data-copy phase to service a revert.
- concepts/gtid-position — the portable-position primitive that lets the freeze-point GTID serve as the "tables are equivalent" reference both for the forward swap and for any subsequent inverse replication stream.
- concepts/consistent-non-locking-snapshot: the `START TRANSACTION WITH CONSISTENT SNAPSHOT` technique the post names as the initial-copy mechanism: "we run `START TRANSACTION WITH CONSISTENT SNAPSHOT`, which takes that snapshot and essentially freezes time while we copy the rows over."
- concepts/online-ddl: the family this work sits in.
- concepts/binlog-replication — the MySQL change-log substrate VReplication tails to keep the shadow caught up and, post-cut-over, to keep the inverse shadow caught up.
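The distinction between the server-side freeze point and the proxy-side query buffer can be sketched as follows. This is a single-threaded toy under stated assumptions: `BufferingProxy` and its methods are hypothetical, lists stand in for tables, and the `writes_during_freeze` argument simulates client writes that arrive while the lock is held.

```python
class BufferingProxy:
    """Hypothetical proxy that holds writes during the freeze instead of failing them."""

    def __init__(self):
        self.production = []  # rows in the table currently serving traffic
        self.shadow = []      # rows in the table kept in sync by replication
        self.frozen = False
        self.buffer = []

    def write(self, row):
        if self.frozen:
            self.buffer.append(row)   # the client waits; nothing is rejected
            return "buffered"
        self.production.append(row)
        self.shadow.append(row)       # stands in for the replication stream
        return "applied"

    def cut_over(self, writes_during_freeze=()):
        self.frozen = True            # freeze point: no write reaches either table
        statuses = [self.write(row) for row in writes_during_freeze]
        assert self.production == self.shadow  # both tables are in sync
        self.production, self.shadow = self.shadow, self.production
        self.frozen = False
        pending, self.buffer = self.buffer, []
        for row in pending:
            self.write(row)           # replay against the new production table
        return statuses


proxy = BufferingProxy()
first = proxy.write("row-1")
during = proxy.cut_over(writes_during_freeze=["row-2"])
```

The write that arrives mid-freeze is delayed rather than lost, which matches the latency-spike-not-errors behaviour described above.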
Extracted patterns¶
- patterns/shadow-table-online-schema-change (new): the canonical four-step online-DDL shape: create an empty shadow copy of the table, apply the DDL to the shadow, backfill and track changes to keep the shadow in sync, cut over under a brief table-level write lock. The pattern is tool-agnostic: `pt-online-schema-change`, `gh-ost`, and Vitess VReplication-driven DDL all instantiate it, and the differences between them are about how rigorously they track state and whether the shadow and stream survive cut-over.
- patterns/instant-schema-revert-via-inverse-replication (new): PlanetScale's distinctive extension. At cut-over, keep the old table and the VReplication stream alive; immediately create an inverse VReplication stream (new → old); when the user requests a revert, swap the tables again (same cut-over, reversed direction). Cost: one extra continuous replication stream + one extra table for the retention window. Value: instant, data-preserving revert of a destructive DDL.
- patterns/snapshot-plus-catchup-replication — the load-bearing data-motion primitive underneath. The post re-states the snapshot+catchup shape in the schema-change context and explicitly maps copy-phase / change-log-phase interleaving to GTID-set comparison.
Architectural diagram (from the post)¶
The post walks the process with 10 screenshots of the following shape:
- Copy schema only: empty shadow table created from `users`, including the `title` column.
- Apply DDL to shadow: `ALTER TABLE users DROP COLUMN title` applied only to the shadow.
- Begin copying rows: row-by-row backfill from production to shadow under consistent snapshot; production is still taking writes.
- Continue copying + track incoming changes: the post explicitly calls out the tension that a row copied to the shadow can be updated in production seconds later, so the shadow catches up through the binlog.
- Snapshot-and-capture: run `START TRANSACTION WITH CONSISTENT SNAPSHOT`; record the GTID; copy the batch.
- Switch to change-log phase: tail binlog events that (a) occurred after the captured GTID and (b) touch rows already copied.
- Switch back to copy phase: capture a new GTID; copy the next batch; loop until source is exhausted.
- Cut-over freeze: hard stop; write lock on the source table; freeze-point GTID recorded.
- Post-cut-over inverse replication: the old table (now shadow) continues being fed by the VReplication stream in the reverse direction. Diagram 9 shows a "Savannah" row arriving on the new table and replicating back to the old one. The stream handles the column projection going backwards: any column the old schema has that the new schema lacks (here `title`) is left NULL or at its default for rows written after cut-over.
- Revert swap: atomic table swap driven by VReplication. Same cut-over sequence as the forward migration, same brief query-buffer at VTGate, completes in a click.
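The alternation between copy phase and change-log phase in the steps above can be sketched with a toy model. Everything here is an assumption made for illustration: `Source` and `Migration` are hypothetical classes, a plain counter stands in for a GTID, a dict copy stands in for the consistent snapshot, and deletes are not modelled.

```python
class Source:
    def __init__(self):
        self.rows = {}
        self.binlog = []  # (gtid, pk, value); gtid is a plain counter here

    def write(self, pk, value):
        self.rows[pk] = value
        self.binlog.append((len(self.binlog) + 1, pk, value))

    def snapshot(self):
        # A consistent view plus the binlog position it corresponds to.
        return dict(self.rows), len(self.binlog)


class Migration:
    def __init__(self, source, batch_size):
        self.source = source
        self.batch_size = batch_size
        self.shadow = {}
        self.last_pk = None     # highest primary key copied so far
        self.position = 0       # last binlog position applied
        self.copy_done = False

    def catch_up(self, upto):
        # Change-log phase: replay events after self.position up to `upto`,
        # but only for rows that have already been copied.
        for _gtid, pk, value in self.source.binlog[self.position:upto]:
            copied = self.last_pk is not None and pk <= self.last_pk
            if self.copy_done or copied:
                self.shadow[pk] = value
        self.position = upto

    def copy_batch(self):
        # Copy phase: snapshot, bring copied rows up to the snapshot's
        # position, then copy the next batch of rows in primary-key order.
        snap, gtid = self.source.snapshot()
        self.catch_up(gtid)
        pending = sorted(pk for pk in snap
                         if self.last_pk is None or pk > self.last_pk)
        batch = pending[:self.batch_size]
        for pk in batch:
            self.shadow[pk] = snap[pk]
        if batch:
            self.last_pk = batch[-1]
        self.copy_done = len(pending) <= self.batch_size


src = Source()
for pk in range(1, 6):
    src.write(pk, f"v{pk}")

m = Migration(src, batch_size=2)
m.copy_batch()                  # copies rows 1-2
src.write(1, "v1-updated")      # already copied: must arrive via the binlog
src.write(4, "v4-updated")      # not copied yet: the next snapshot includes it
m.copy_batch()                  # replays the row-1 update, copies rows 3-4
src.write(6, "v6")
m.copy_batch()                  # copies rows 5-6; copy is now complete
src.write(2, "v2-final")
m.catch_up(len(src.binlog))     # final catch-up, as at the cut-over freeze

in_sync = m.shadow == src.rows
```

Events for rows that have not been copied yet are skipped on purpose, since a later snapshot already reflects them; this is what lets the two phases share one ordering.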
Operational numbers and claims¶
- "All of this is done in seconds, behind the scenes, with just a click of a button." — PlanetScale's revert SLA framing. Order-of-magnitude claim rather than a documented p99.
- Cut-over itself is a brief write lock + table swap; the claim is that "writes will still be allowed from the application's perspective" because VTGate holds them. Duration figure not published in this post (the companion zero-downtime migrations at petabyte scale post puts the analogous `MoveTables SwitchTraffic` cut-over at "less than 1 second").
- The post does not publish retention-window numbers for how long the inverse-replication workflow is kept alive post-cut-over, i.e. the revert horizon.
Caveats and unknowns¶
- Scope of "revert." The post describes `ALTER`-column revert. `CREATE` and `DROP` reverts are acknowledged as needing different plumbing and are deferred to a future post. This wiki's patterns/instant-schema-revert-via-inverse-replication pattern is documented in the `ALTER` shape only.
- How long is the revert window? Not stated. The inverse-replication stream is running "regardless of if you eventually click revert or not," but the post doesn't commit to a retention period. Operationally there must be one: a running replication stream is not free.
- Column-projection semantics on revert. The worked example (`DROP COLUMN title`) shows the forward direction dropping a column the application no longer references. On revert, the column reappears and rows added during the new-schema window have no value for it. The post names this explicitly: "Savannah doesn't have a title. This is because that entry was added after the tables were swapped, so the `title` column didn't exist in production. This is expected and something you can clean up after the revert, if necessary." The generalisation (what happens on revert for `ADD COLUMN`? `MODIFY COLUMN`? charset changes?) is not documented.
- Publication date ambiguity. Frontmatter `published:` is `2026-04-21` (the re-fetch), but the body byline reads "March 24, 2022." This is an older post by Holly Guevara and Shlomi Noach re-fetched by PlanetScale's blog infrastructure; the architectural content is still current because VReplication's post-cut-over non-termination property has not been deprecated.
- Dependency on Vitess. Non-Vitess MySQL-compatible platforms (RDS MySQL, Aurora MySQL, direct MySQL with `pt-online-schema-change`/`gh-ost`) cannot instantiate this revert pattern: those tools tear down their shadow table at cut-over and do not keep a change-log stream alive afterwards. This is a PlanetScale-specific capability built on a Vitess-specific VReplication property.
Source¶
- Original: https://planetscale.com/blog/behind-the-scenes-how-schema-reverts-work
- Raw markdown: `raw/planetscale/2026-04-21-behind-the-scenes-how-schema-reverts-work-f3dbb4e0.md`
Related¶
- systems/planetscale
- systems/vitess
- systems/vitess-vreplication
- systems/mysql
- concepts/shadow-table
- concepts/cutover-freeze-point
- concepts/pre-staged-inverse-replication
- concepts/gtid-position
- concepts/consistent-non-locking-snapshot
- concepts/online-ddl
- concepts/binlog-replication
- concepts/query-buffering-cutover
- patterns/shadow-table-online-schema-change
- patterns/instant-schema-revert-via-inverse-replication
- patterns/snapshot-plus-catchup-replication
- patterns/reverse-replication-for-rollback
- patterns/expand-migrate-contract
- companies/planetscale