
PATTERN Cited by 1 source

Branch-based schema-change workflow

Problem

Production schema changes are two hard things entangled:

  1. Authoring — iterating on column shapes, index choices, type changes. This work benefits from speed, isolation, and permission to break things.
  2. Applying — the actual cutover against a live, user-facing database. This work requires safety, auditability, and zero impact on production traffic.

Naively merging these — editing schema directly against production, or running hand-crafted ALTER TABLE statements after a DBA review ticket — produces a well-documented tax:

  • Weeks of latency on DBA review pipelines ("opening tickets to get a DBA review for each change. This can take weeks" — Burns, 2021).
  • Columns intentionally avoided because the team knows the migration won't fit in a maintenance window ("engineering teams won't change or update certain columns in their production databases because the migration will take too long").
  • Schema shapes distorted to avoid migrations ("others have told us about how they turned columns in their relational databases into JSON stores, just to avoid schema migrations").
  • Concurrent-change races — team A's ALTER and team B's ALTER discovered to conflict only at cutover, hours into the migration.
  • Noisy-neighbor migrations, where a migration copying a terabyte table saturates disk I/O and drives up OLTP latency.

The underlying cause: there is no workflow primitive that cleanly separates authoring from applying, and no ahead-of-time safety analysis before the costly migration work begins.

Solution

Treat schema changes like code changes. Put a git-like branch between the developer and production, require a pull-request-like deploy request for merge, run a three-check safety pipeline before queue admission, serialise queued requests through a one-at-a-time deploy queue, and run the actual migration with a traffic-aware throttler that yields to production OLTP.

Canonical statement of the pattern (Burns, PlanetScale 2021):

"We want developers to push schema changes as easily, and as often, as they push code changes."

(Source: sources/2026-04-21-planetscale-non-blocking-schema-changes.)

Three enumerated pillars, verbatim:

  • "Allows users to test out schema changes on a branch that is isolated"
  • "Analyzes schema changes in advance to ensure there are no conflicts"
  • "Deploys schema changes in the background without impact to production"

The end-to-end workflow

Step 1 — Branch

The developer creates a database branch off production. The branch is automatically deployed with a copy of the production schema, isolated from live traffic. The developer then edits the schema on the branch and tests applications against it.
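The isolation property of a branch can be illustrated with a minimal sketch. It assumes a schema is modelled as a nested dictionary (table name to column name to type); that representation and the `create_branch` helper are hypothetical and are not how the platform stores schemas.

```python
import copy
from typing import Dict

# Hypothetical in-memory schema model: table -> column -> type.
Schema = Dict[str, Dict[str, str]]


def create_branch(production_schema: Schema) -> Schema:
    """Return an independent copy of the production schema.

    A deep copy ensures that edits made on the branch never
    reach the production schema object.
    """
    return copy.deepcopy(production_schema)


production: Schema = {"users": {"id": "bigint", "name": "varchar(50)"}}
branch = create_branch(production)

# Edit freely on the branch; production is untouched.
branch["users"]["email"] = "varchar(255)"
```

After this runs, `production["users"]` still has only `id` and `name`, which is the property the branch step relies on.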

Step 2 — Deploy request (optionally with review)

Developer opens a deploy request against production. Two paths:

  • With review: a teammate reviews the schema change (later extended by the PR-bot pattern).
  • Direct to queue: developer skips review and adds the deploy request directly to the deploy queue.

Step 3 — Pre-flight conflict check (admission control)

The platform runs a three-check pre-flight schema-conflict check:

  1. Branch-HEAD vs. main-HEAD-at-fork-time.
  2. Branch-HEAD vs. main-HEAD-now (main may have drifted since fork).
  3. Branch-HEAD vs. queued-ahead deploy requests.

If any check fails, the deploy request is rejected and the user is notified. This rejection happens at queue admission, not at cutover, saving "up to a few days" on long-running migrations (Burns, 2021).
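The three checks can be sketched as a single admission function. This is a simplified illustration, not the platform's algorithm: it assumes schemas are nested dictionaries, treats a branch with no changes as failing check 1, and defines a "conflict" as two change sets touching the same (table, column) pair. The names `diff` and `preflight` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Schema = Dict[str, Dict[str, str]]  # table -> column -> type (assumed model)
Change = Tuple[str, str]            # (table, column)


def diff(base: Schema, head: Schema) -> Set[Change]:
    """Columns added, dropped, or retyped between base and head."""
    changed: Set[Change] = set()
    for table in set(base) | set(head):
        b, h = base.get(table, {}), head.get(table, {})
        for col in set(b) | set(h):
            if b.get(col) != h.get(col):
                changed.add((table, col))
    return changed


def preflight(
    fork_point: Schema,
    main_now: Schema,
    branch_head: Schema,
    queued: List[Set[Change]],
) -> Tuple[bool, str]:
    """Run the three checks in order; return (admitted, reason)."""
    # Check 1: branch HEAD vs. main HEAD at fork time.
    mine = diff(fork_point, branch_head)
    if not mine:
        return False, "check 1: branch has no changes relative to its fork point"
    # Check 2: branch HEAD vs. main HEAD now (main may have drifted).
    drift = diff(fork_point, main_now)
    if mine & drift:
        return False, "check 2: main changed the same columns since the fork"
    # Check 3: branch HEAD vs. deploy requests queued ahead.
    for position, other in enumerate(queued):
        if mine & other:
            return False, f"check 3: conflicts with queued request #{position}"
    return True, "admitted"
```

All three checks are cheap set operations on schema metadata, which is why they can run at admission time rather than after the row copy has started.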

Step 4 — Serialised deploy queue

Accepted deploy requests enter a per-target schema-change queue and run one at a time in submission order. Burns's 2021 disclosure: "In our experience, deploying schema changes one at a time is generally more efficient than running them concurrently, with a few exceptions."
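A per-target, one-at-a-time queue can be sketched in a few lines. The `DeployQueue` class and its method names are hypothetical; the sketch only shows the ordering guarantee (submission order, at most one running request per target) and ignores persistence, cancellation, and the exceptions the quote alludes to.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class DeployQueue:
    """FIFO queue per target database; at most one request runs per target."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[str]] = defaultdict(deque)
        self._running: Dict[str, str] = {}

    def submit(self, target: str, request_id: str) -> None:
        """Append an admitted deploy request to the target's queue."""
        self._queues[target].append(request_id)

    def start_next(self, target: str) -> Optional[str]:
        """Start the oldest queued request unless one is already running."""
        if target in self._running or not self._queues[target]:
            return None
        request_id = self._queues[target].popleft()
        self._running[target] = request_id
        return request_id

    def finish(self, target: str) -> Optional[str]:
        """Mark the running request for this target as complete."""
        return self._running.pop(target, None)
```

Because queues are keyed by target, requests against different databases do not block one another; only requests against the same target are serialised.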

Step 5 — Non-blocking migration with traffic-aware throttling

When a deploy request reaches the head of the queue, the platform runs a [[patterns/shadow-table-online-schema-change|shadow-table online schema change]] via gh-ost (2021-era; later VReplication) orchestrated by Vitess. The migration is traffic-aware: if production traffic spikes, the migration scales down to avoid contending for resources. The operator-visible promise: the migration runs "in the background without impact to production."
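One simple way to make a row copy yield to production traffic is to check a load signal before every chunk and pause while it is above a threshold. The sketch below illustrates that pause-and-retry idea only; it is not gh-ost's or Vitess's implementation. The `throttled_copy` function, the `current_load` callback, and the threshold are all hypothetical, and the actual row copy is reduced to a counter.

```python
import time
from typing import Callable, List


def throttled_copy(
    total_rows: int,
    chunk_size: int,
    current_load: Callable[[], float],
    max_load: float,
    sleep: Callable[[float], None] = time.sleep,
    pause_seconds: float = 1.0,
) -> List[str]:
    """Copy rows in chunks, pausing whenever load exceeds max_load.

    Returns a log of actions so the behaviour is easy to inspect.
    """
    copied = 0
    log: List[str] = []
    while copied < total_rows:
        if current_load() > max_load:
            # Production is busy: back off instead of competing for I/O.
            log.append("pause")
            sleep(pause_seconds)
            continue
        step = min(chunk_size, total_rows - copied)
        copied += step  # stand-in for copying `step` rows to the shadow table
        log.append(f"copy {step}")
    return log
```

Checking the signal between small chunks bounds how long the migration can keep contending after a traffic spike begins: at most one chunk's worth of work.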

The three pillars in plain English

Pillar | Principle | Wiki page
Isolated authoring | Branch off prod schema | concepts/database-branching
Ahead-of-time safety | Reject conflicts before cutover | concepts/pre-flight-schema-conflict-check
Invisible application | Migration yields to OLTP | concepts/traffic-aware-migration-throttling

Comparison with prior art

Tool Schema versioning Branching Pre-flight conflict Non-blocking exec Traffic-aware Managed workflow
Liquibase
Flyway
pt-online-schema-change partial
gh-ost ✓ (via flags)
PlanetScale (2021) ✓ (via gh-ost)

Burns's 2021 positioning (verbatim):

"Both pt-online-schema-change and gh-ost offer online or non-blocking schema changes (schema change migrations that don't lock tables while being deployed). They do so by creating a new table that is a copy of the given table. The schema changes are applied to the new table and the data in the original table is copied over. Once that is complete, the original table is replaced by the new table. However, these tools are often run manually and require the support of additional infrastructure."

The pattern's contribution is integration, not novel mechanism: wrap gh-ost's proven migration engine inside a developer-facing workflow that removes the manual operational overhead.
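The copy-and-swap mechanism described in the quote can be sketched against SQLite, used here only as a convenient stand-in for MySQL. The `shadow_table_change` helper and the `_<table>_new` / `_<table>_old` naming are hypothetical. The sketch also omits the hard part that real tools handle: capturing writes that arrive on the original table while the copy is in progress.

```python
import sqlite3
from typing import Sequence


def shadow_table_change(
    conn: sqlite3.Connection,
    table: str,
    new_ddl: str,
    columns: Sequence[str],
) -> None:
    """Apply a schema change by building a shadow table and swapping it in.

    new_ddl is a CREATE TABLE statement with a {name} placeholder;
    columns lists the columns to carry over from the original table.
    """
    shadow = f"_{table}_new"
    old = f"_{table}_old"
    cols = ", ".join(columns)

    # 1. Create the shadow table with the new schema.
    conn.execute(new_ddl.format(name=shadow))

    # 2 and 3. Copy existing rows, then swap names, in one transaction.
    with conn:
        conn.execute(f"INSERT INTO {shadow} ({cols}) SELECT {cols} FROM {table}")
        conn.execute(f"ALTER TABLE {table} RENAME TO {old}")
        conn.execute(f"ALTER TABLE {shadow} RENAME TO {table}")

    # 4. Drop the original once the swap has committed.
    conn.execute(f"DROP TABLE {old}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("lin",)])
conn.commit()

shadow_table_change(
    conn,
    "users",
    "CREATE TABLE {name} (id INTEGER PRIMARY KEY, name TEXT, email TEXT)",
    ["id", "name"],
)
```

The original table stays readable and writable until the rename, which is what makes the approach non-blocking in spirit; in this toy version any write made between the copy and the swap would be lost, which is exactly the gap that the production tools close.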

When to apply

  • Multi-team engineering orgs doing frequent schema changes against shared databases.
  • Workloads where migration duration is a first-order operational constraint (petabyte-scale tables, migrations running for days).
  • Teams currently avoiding schema changes because of their operational cost — the pattern removes the friction so schema shape can follow application evolution.

When not to apply

  • Single-developer side projects — the workflow overhead dwarfs the gain.
  • Workloads with maintenance windows available and low enough migration volume that manual coordination works fine.
  • Schemas that never change (rare, but they exist — lookup tables, reference data).
