
PATTERN Cited by 1 source

Instant schema revert via inverse replication

Problem

A destructive online schema change (e.g. ALTER TABLE users DROP COLUMN title) has completed, cut over, and is running in production. Minutes later — or hours, or a day — the team discovers a regression: an unmigrated reader needs the dropped column; a downstream job breaks; an intern shipped the wrong DDL. The options on a normal MySQL stack are all bad:

  • Run the compensating DDL (ALTER TABLE users ADD COLUMN title VARCHAR(...)). Restores the schema but not the data: the DROP destroyed the column's values for every existing row, and rows written after the forward cut-over never had a value to begin with, because those writes only ever knew the new schema.
  • Restore from backup. Restores both schema and data, but discards every write that landed after the backup point. For a table taking thousands of writes per second, that can be a multi-hour data-loss window.
  • Reconcile manually. Custom script to re-derive the dropped column's values from other columns or upstream sources. Hours of engineering time, per incident.
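The gap left by the first option can be seen in a toy simulation (hypothetical Python, modeling a table as a list of row dicts — nothing here is Vitess or MySQL internals):

```python
# Toy model: a table is a list of row dicts (illustration only).
rows = [{"id": 1, "name": "Ada", "title": "Dr."}]

# Forward migration: ALTER TABLE users DROP COLUMN title.
for r in rows:
    del r["title"]

# Writes that land after cut-over only know the new schema.
rows.append({"id": 2, "name": "Savannah"})

# Compensating DDL: ALTER TABLE users ADD COLUMN title VARCHAR(...).
# The schema is back, but every row gets NULL -- the DROP destroyed
# pre-cut-over values, and post-cut-over writes never had any.
for r in rows:
    r.setdefault("title", None)

assert all(r["title"] is None for r in rows)
```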

The architectural problem: the forward cut-over was a one-way door. Everything built on top of the new schema has to either succeed on the new schema or reckon with data loss.

Solution

At forward cut-over, don't tear down the shadow table or its VReplication stream. Instead, immediately re-prime the stream in the inverse direction (new → old) and keep it running in the background regardless of whether the user ever requests a revert. The old table is now a live-updated inverse shadow; every commit to the new table is projected back through the inverse of the DDL and applied to the old table.

Revert is then the same cut-over sequence as the forward migration, played backwards:

  1. Brief write-lock on the (now-production) new table.
  2. Drain any residual inverse-replication events.
  3. Mark a new freeze-point GTID.
  4. RENAME TABLE users TO users_new, users_old_shadow TO users.
  5. Release the lock.

Because the old table has been kept continuously in sync with the new table, no data copy is needed — the swap is instant. Because VTGate buffers queries across the swap, the application sees a brief latency spike, not errors.
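The five-step sequence can be sketched as an in-process model (hypothetical Python: "tables" are name-to-rows mappings, residual inverse events sit in a queue, and the GTID is a stand-in integer — none of these names come from Vitess):

```python
import queue

# Illustrative state: the new-schema table is production; the old
# table has been kept in sync by the inverse stream.
tables = {
    "users": [{"id": 1, "name": "Ada"}],                      # new schema
    "users_old_shadow": [{"id": 1, "name": "Ada", "title": None}],
}
inverse_events = queue.Queue()
inverse_events.put({"id": 2, "name": "Savannah", "title": None})
writes_locked = False
gtid = 104  # stand-in for a real GTID position

def revert_cutover():
    global writes_locked
    writes_locked = True                        # 1. brief write-lock
    while not inverse_events.empty():           # 2. drain residual events
        tables["users_old_shadow"].append(inverse_events.get())
    freeze_gtid = gtid                          # 3. mark freeze-point GTID
    tables["users_new"] = tables.pop("users")   # 4. RENAME TABLE swap
    tables["users"] = tables.pop("users_old_shadow")
    writes_locked = False                       # 5. release the lock
    return freeze_gtid

revert_cutover()
# Old schema is live again and no post-cut-over rows were lost.
assert [r["id"] for r in tables["users"]] == [1, 2]
```

No data copy happens anywhere in the sequence — only a drain and a rename — which is why the swap is effectively instant.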

This is the load-bearing novelty of PlanetScale's schema-revert feature, documented by Guevara and Noach as "an open-ended revert" that is staged "shortly after migration completion … regardless of if you eventually click revert or not."

Mechanics

At forward cut-over time, the Vitess migration workflow:

  1. Completes the forward swap under the freeze point.
  2. Leaves the old table in place (renamed to a shadow-table name like _vrepl_...). Normally, online-DDL tools drop the old table at this point.
  3. Creates an inverse VReplication stream (new → old), primed at the freeze-point GTID so no writes are lost across the transition.
  4. Applies every incoming commit on the new table to the old table through the DDL's inverse projection.
  5. Holds this state open for an operator-configurable retention horizon.
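Step 4 above can be sketched for the running DROP COLUMN title example (hedged: an illustrative Python projection over row dicts, whereas real VReplication applies binlog events):

```python
# Inverse projection for the forward DDL "DROP COLUMN title":
# the inverse is "ADD COLUMN title", so rows replicated new -> old
# gain a NULL title -- the value is unknowable after the drop.
def inverse_project(row, dropped_columns=("title",)):
    out = dict(row)
    for col in dropped_columns:
        out.setdefault(col, None)
    return out

old_table = []
new_table_commits = [
    {"id": 2, "name": "Savannah"},  # written after forward cut-over
]
for commit in new_table_commits:    # apply every commit, new -> old
    old_table.append(inverse_project(commit))

assert old_table == [{"id": 2, "name": "Savannah", "title": None}]
```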

On a revert request:

  1. Run the same shadow-table cut-over sequence, swapping the new table with the old (inverse shadow).
  2. The old schema + schema-compatible data are live again, instantly.
  3. Rows written after the forward cut-over, carried back by the inverse stream, have no values for columns that exist only in the old schema (a dropped column reappears as NULL) — the mirror image of the forward-direction gap, with the same asymmetry.

Trade-offs

  • The forward schema-change workflow must not terminate. The migration runner has to stay alive past cut-over and continue running a replication stream. For Vitess's VReplication this is free — the stream is already a long-lived workflow that persists its state in sidecar tables, not a standalone binary. For pt-online-schema-change / gh-ost this is not free; retrofitting instant revert onto those tools means introducing a new long-lived component.
  • Two tables plus two replication streams during the revert horizon. Storage cost ~2x the table size; CPU cost ~2x the continuous-change application. Operational bet: the marginal cost of keeping the inverse stream running is small relative to the cost of a forced restore-from-backup during a revert incident.
  • Column-projection asymmetry. See concepts/pre-staged-inverse-replication for the taxonomy. DROP COLUMN reverts cleanly (post-cut-over rows reappear with NULL in the restored column); ADD COLUMN reverts lossily (post-cut-over column values are discarded); MODIFY COLUMN reverts may require a reverse-cast that isn't total; CREATE TABLE / DROP TABLE require distinct plumbing the post acknowledges but doesn't document.
  • Revert horizon is finite. The inverse stream and retained shadow cost real resources. Operators have to pick a horizon — minutes? hours? days? — that balances the cost against the probability of needing a revert that far out. The post doesn't commit to a specific horizon.
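The column-projection asymmetry above can be made concrete with per-DDL inverse projections (hypothetical helper names, mirroring the taxonomy, not Vitess internals):

```python
def revert_drop_column(row, col):
    # DROP COLUMN reverts cleanly: the column reappears as NULL.
    return {**row, col: None}

def revert_add_column(row, col):
    # ADD COLUMN reverts lossily: post-cut-over values are discarded.
    return {k: v for k, v in row.items() if k != col}

def revert_modify_column(row, col, reverse_cast):
    # MODIFY COLUMN needs a reverse cast, which may not be total
    # (e.g. BIGINT -> INT truncates out-of-range values).
    return {**row, col: reverse_cast(row[col])}

assert revert_drop_column({"id": 2}, "title") == {"id": 2, "title": None}
assert revert_add_column({"id": 2, "flag": 1}, "flag") == {"id": 2}
assert revert_modify_column({"id": 2, "n": "7"}, "n", int) == {"id": 2, "n": 7}
```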

Contrast with data-motion reverse replication

The wiki also documents patterns/reverse-replication-for-rollback, which is the same underlying idea applied to data-motion migrations: MoveTables SwitchTraffic creates a reverse VReplication workflow so MoveTables ReverseTraffic can cut back without data loss. The two patterns are siblings under a single meta-pattern ("keep the inverse replication alive so cut-over is a revolving door, not a one-way door"):

|                    | patterns/reverse-replication-for-rollback    | Instant schema revert via inverse replication |
|--------------------|----------------------------------------------|-----------------------------------------------|
| Scope              | Data-motion migration                        | Online DDL                                    |
| What flips         | Routing rules at proxy (which keyspace)      | Table names at MySQL (which schema)           |
| Cut-over primitive | patterns/routing-rule-swap-cutover           | patterns/shadow-table-online-schema-change    |
| Canonical command  | MoveTables SwitchTraffic / ReverseTraffic    | "Revert changes" button                       |

Composes with

Seen in

  • sources/2026-04-21-planetscale-behind-the-scenes-how-schema-reverts-work — canonical wiki instance. Guevara + Noach: "Shortly after migration completion, PlanetScale prepares an open-ended revert. The revert process tracks ongoing changes to the table and applies them to a shadow table. That should sound familiar. Indeed, we already have a shadow table in place. It is already populated with data, and we know that it was in full sync with what we now call the new table at cut-over time. So in the previous example, once the deployment is complete, you decide you need to revert it. … So once you click revert, all we need to do is swap them again!" The post frames this as unique to Vitess among online schema-change solutions: "Unlike any other schema change solution, Vitess does not terminate upon migration completion. This point is important when it comes to reverting schema changes." Walked example: ALTER TABLE users DROP COLUMN title, post-cut-over rows (like "Savannah") survive the revert but reappear with NULL in the restored title column — "expected and something you can clean up after the revert, if necessary."