
PATTERN Cited by 2 sources

Near-atomic multi-change deployment

Problem

Traditional multi-ALTER deployment on MySQL has a nasty shape: the engine serialises DDL, and individual ALTERs on large tables take hours. Deploying three 8-hour changes sequentially means 24 hours of partially-deployed schema. During that window:

  • The schema doesn't match source control.
  • Incident response has no clean abort path — completed migrations are irreversible without authoring new DDL.
  • Every new deploy-request stacked on top has ambiguous base state.
  • Application behaviour is subtly different as each migration lands (some columns exist, some don't; some views reflect new definitions, some don't).

The problem is to deploy N schema changes as a single unit that completes all-together (or not-at-all) in an application-visible event measured in seconds, not hours.
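The cost model above can be sketched numerically. A minimal sketch, with illustrative durations (the three 8-hour changes from the text) and an assumed few-second cut-over gap:

```python
# Deployment-window sketch: sequential apply vs near-atomic staging.
# Durations are illustrative, taken from the 8-hour example in the text.
migration_hours = [8, 8, 8]

# Sequential apply: each ALTER completes before the next starts, so the
# schema stays partially deployed for the sum of all durations.
sequential_window_h = sum(migration_hours)      # 24 hours of mixed state

# Near-atomic: copy/catch-up phases run in parallel and are held staged,
# so staging wall-clock is the longest single migration, and the
# application-visible window is only the cut-over sequence.
staging_wall_clock_h = max(migration_hours)     # 8 hours, invisible to the app
cutover_gap_s = 3                               # "a few seconds apart" (assumed)
visible_window_s = cutover_gap_s * len(migration_hours)

print(sequential_window_h)  # 24
print(visible_window_s)     # 9
```

The point of the arithmetic: parallel staging does not make the migrations faster, it moves the entire mixed-schema exposure out of the application-visible window.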

Solution

Treat the set of schema changes in a deploy-request as a single deployment unit. Run every long-running migration's copy + catch-up phase in parallel, holding each in staged-then-sealed state. When every migration reports ready-to-complete, seal them all together — cut over each in rapid sequence (a few seconds apart), preceded by the immediate DDL changes that were deferred to the end.

The application-visible result is one cut-over event per migration, all landing within seconds of each other: operationally indistinguishable from an atomic multi-table schema transaction, even though MySQL provides no such primitive.

Canonical verbatim shape (Source: sources/2026-04-21-planetscale-deploying-multiple-schema-changes-at-once):

"PlanetScale deploys your entire set of changes near-atomically, which means the production database schema remains stable throughout the deployment process and changes almost all at once when all changes are ready."

Mechanism

The pattern composes four building blocks:

  1. schemadiff analysis phase (cost: milliseconds to seconds):
     • Parse current-production vs target-branch schemas.
     • Compute the diff DDL.
     • Partition diffs into equivalence classes.
     • Within each class, find a valid execution permutation by in-memory validity checking.
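A minimal sketch of the partition-and-order step, under stated assumptions: the diff names and dependency edges here are illustrative (the real schemadiff derives them from parsed schemas), classes are formed by grouping diffs connected through dependency edges, and the in-memory validity check is modelled as trying orderings until one respects every edge.

```python
from itertools import permutations

# Illustrative diff set: a column add, a view that depends on it, and an
# unrelated table create. depends_on[a] = {b, ...} means a runs after b.
diffs = ["add_col_orders", "alter_view_orders_v", "create_tbl_audit"]
depends_on = {"alter_view_orders_v": {"add_col_orders"}}

def equivalence_classes(diffs, depends_on):
    """Group diffs connected by dependency edges; independent diffs are singletons."""
    parent = {d: d for d in diffs}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for a, deps in depends_on.items():
        for b in deps:
            parent[find(a)] = find(b)
    classes = {}
    for d in diffs:
        classes.setdefault(find(d), []).append(d)
    return list(classes.values())

def valid_permutation(cls, depends_on):
    """In-memory validity check: try orderings until one respects every edge."""
    for order in permutations(cls):
        pos = {d: i for i, d in enumerate(order)}
        if all(pos[b] < pos[a] for a in order
               for b in depends_on.get(a, ()) if b in pos):
            return list(order)
    return None  # no valid order exists: the diff set is rejected

classes = equivalence_classes(diffs, depends_on)
print(valid_permutation(max(classes, key=len), depends_on))
# prints ['add_col_orders', 'alter_view_orders_v']
```

Brute-force permutation search is only a stand-in for whatever ordering algorithm schemadiff actually uses; the observable contract is the same: dependent diffs land in one class, in a dependency-respecting order, and unrelated diffs stay independent.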

  2. Shadow-table online schema change + staging:
     • For each long-running migration, create the shadow table, backfill under consistent snapshots, tail the binlog.
     • Hold each migration in catch-up state indefinitely (staged-then-sealed).
     • Publish a ready flag per migration to the deploy controller.

  3. Gate coordination:
     • Deploy controller polls ready flags.
     • When every flag is ready, the gate opens.
     • During staging, the deployment is fully cancellable with zero production impact.

  4. Near-atomic seal:
     • Apply cut-overs to every long-running migration in the pre-computed order (a few seconds apart).
     • Apply the immediate DDL (CREATE TABLE, ALTER VIEW) at the end, respecting per-class dependency ordering from schemadiff.
     • Stage [[patterns/instant-schema-revert-via-inverse-replication|inverse replication]] streams on every migration for the 30-minute revert window.
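The stage/gate/seal loop above can be sketched as a toy controller. Everything here is illustrative, not PlanetScale's implementation: migrations are threads, the copy phase is a sleep, ready flags are events, and the seal is a sequential append.

```python
import threading
import time

class DeployController:
    """Toy gate: stage migrations in parallel, cut over only when all are ready."""

    def __init__(self, migrations):
        self.migrations = migrations
        self.ready = {m: threading.Event() for m in migrations}
        self.cancelled = threading.Event()
        self.completed = []

    def stage(self, migration, copy_seconds):
        # Stand-in for shadow-table copy + binlog catch-up, then hold staged.
        time.sleep(copy_seconds)
        self.ready[migration].set()       # publish ready-to-complete

    def cancel(self):
        # While the gate is closed, cancellation has zero production impact.
        self.cancelled.set()

    def run(self, copy_seconds):
        workers = [threading.Thread(target=self.stage, args=(m, copy_seconds))
                   for m in self.migrations]
        for w in workers:
            w.start()
        for flag in self.ready.values():  # gate: block until every flag is set
            flag.wait()
        if self.cancelled.is_set():
            return []                     # abort before any production change
        for m in self.migrations:         # seal: rapid sequential cut-overs
            self.completed.append(m)
        return self.completed

ctrl = DeployController(["alter_orders", "alter_users", "alter_events"])
print(ctrl.run(copy_seconds=0.01))
# prints ['alter_orders', 'alter_users', 'alter_events']
```

The sketch shows the key property in miniature: nothing touches "production" state until the last flag is set, so the whole deployment is abortable for free during the long staging phase.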

When to use

Use near-atomic multi-change deployment when:

  • Multiple schema changes are semantically one feature (app code depends on all of them together).
  • Individual migrations take long enough that sequential apply creates an operationally painful window (hours to days).
  • Full cancellability during staging has operational value (incident response, design iteration).
  • The reverse-order revert's 30-minute window is enough rollback flexibility (vs forever-undoable via new forward migrations).

Do NOT use it for:

  • Single schema changes. One migration has no cross-migration coordination to exploit. Standard Online DDL is simpler.
  • Schema changes that must be permanent immediately. The 30-minute revert window adds resource cost (inverse replication streams, storage). For schema changes that genuinely need to be irreversible immediately, the revert-window primitive is cost without benefit.
  • Hundred-table branches. Resources are not infinite. The post warns: "Altering a hundred tables in one deployment request is not feasible and possibly not the best utilization of database branching." schemadiff will refuse branches exceeding the "reliably safe path" threshold. Decompose into multiple smaller deploy-requests.

Trade-offs

  • Resource amplification during staging. Every staged migration holds a full shadow table; binlog tailing accumulates; schemadiff's in-memory validation scales with graph complexity. For N migrations on M-sized tables, storage cost during staging is O(N × M).
  • Gate-readiness timing is a coordination problem. The longest-running migration gates the entire deployment. The deploy controller must handle per-migration failure (a single stuck migration blocks the gate) — the post doesn't disclose the failure-recovery semantics.
  • Cannot deliver true atomicity. The cut-over is "a few seconds apart," not instantaneous. An incident during the seal window could leave some migrations completed and some not. MySQL's lack of transactional DDL is the hard constraint; this pattern minimises the exposure, doesn't eliminate it.
  • Revert window resource cost. Pre-staged inverse replication streams for 30 minutes after seal × N migrations is continuously-running work. Worth it for the operational property, but is not free.

Composes with

Canonical implementation

PlanetScale's deploy-request system on MySQL + Vitess is the canonical wiki instance. The Shlomi Noach post (2023-08-29) is the public-facing architectural description; the Vitess 21 release notes (2026) confirm the mechanism continues to evolve (more INSTANT DDL analysis, charset handling, schemadiff capabilities).

No non-Vitess implementation is disclosed on the wiki as of 2026-04-23. Atlas CLI's declarative apply and Skeema / Bytebase GitOps-style schema-change platforms handle single-step apply but don't provide the gated-multi-migration coordination layer — they operate as clients of the underlying engine's single-DDL semantics, accepting the sequential-apply cost model.

Seen in

  • sources/2026-04-21-planetscale-gated-deployments-addressing-the-complexity-of-schema-deployments-at-scale — earliest canonical wiki disclosure of the pattern. Shlomi Noach's 2022-09-06 Gated Deployments launch post introduces the deploy-unit framing (multi-change + multi-shard dimensions) and the app-facing property ("the deployment can be considered more atomic; up till the final stage, no change is reflected in production"). The 2022 post also canonicalises the scheduling rule "we run as much of the bulk work as possible upfront, sequentially, and then run the more lightweight work in parallel" — sequential copy + parallel tail is named at the launch-post altitude; the 2023 successor formalises it as interleaved-copy-phases. The 2022 post extends this pattern with the multi-shard and operator-scheduled-cutover dimensions.
  • sources/2026-04-21-planetscale-deploying-multiple-schema-changes-at-once — canonical wiki first disclosure. Shlomi Noach introduces the pattern end-to-end: the 8-hour / 24-hour sequential-apply complaint, the copy-and-swap emulation as the enabling primitive, the equivalence-class partition for dependency resolution, the staged-then-sealed shape, the near-atomic cut-over window, the 30-minute reverse-order revert window. The architectural load-bearer for every PlanetScale multi-migration deploy-request.