Expand-migrate-contract (parallel change)¶
Expand-migrate-contract (also known as parallel change, backward-compatible changes, or colloquially the expand-contract pattern) is the canonical six-step discipline for changing an existing database-schema element — renaming a column/table, changing a data type, splitting or merging columns, restructuring how data is represented — without coupling the application-code deploy to the schema deploy. Each step is independently deployable and independently rollback-able; the cost is that what would be a single synchronous breaking change becomes a six-step sequence running over days to weeks.
When to use it¶
Use expand-migrate-contract when:
- The schema element being changed is already in use by the production application (reads or writes).
- Downtime is unacceptable (OLTP path, customer-visible queries, or cross-service contracts).
- The schema change is structural (rename / type change / split / merge / reshape) rather than additive.
Do NOT use it for purely additive changes — adding a new table / column / view. For additive changes, deploying the schema change first (before the application code that uses it) is sufficient; the six-step dance is cost-without-benefit. The distinguishing test is "does this change touch schema my production application is already using?" — if no, just deploy schema-then-app.
(Source: sources/2026-04-21-planetscale-backward-compatible-database-changes)
The six steps¶
Step 1 — Expand¶
Add the new schema element alongside the old one. Make
the addition minimally-disruptive to existing writes: a new
column must be nullable or carry a safe default so
existing INSERT statements that don't mention it still
succeed.
Deploy and verify: the application is unchanged, writes still go to the old schema, reads still come from the old schema. The new column / table exists but is empty / unreferenced.
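The expand step's safety property can be sketched with SQLite standing in for the production database (a minimal illustration, not MySQL-specific DDL; the `star`/`repo` names are borrowed from the worked example later on this page):

```python
import sqlite3

# Hypothetical stand-in for the production schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE star (repo_name TEXT PRIMARY KEY, star_count INTEGER)")
conn.execute("INSERT INTO star VALUES ('planetscale/docs', 120)")
conn.execute("CREATE TABLE repo (repo_name TEXT PRIMARY KEY)")

# Step 1 - Expand: the new column is nullable by default, so existing
# INSERT statements that don't mention it still succeed.
conn.execute("ALTER TABLE repo ADD COLUMN star_count INTEGER")

# An existing write path that knows nothing about the new column still works.
conn.execute("INSERT INTO repo (repo_name) VALUES ('planetscale/docs')")

# The new column exists but is empty / unreferenced.
row = conn.execute(
    "SELECT star_count FROM repo WHERE repo_name = 'planetscale/docs'"
).fetchone()
assert row[0] is None
```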
Step 2 — Dual-write¶
Update the application to write the new representation to the new schema on every write, in addition to writing the old representation to the old schema. Reads still come from the old schema.
```python
# Before
session.execute("INSERT INTO star (repo_name, star_count) VALUES (%s, %s)",
                (repo_name, count))

# After
session.execute("INSERT INTO star (repo_name, star_count) VALUES (%s, %s)",
                (repo_name, count))
session.execute("UPDATE repo SET star_count = %s WHERE repo_name = %s",
                (count, repo_name))
```
Deploy and verify: the application's behaviour for users is unchanged (reads still hit the old schema); new writes now populate both schemas. Confirm no regression. If the new-schema write fails, the old-schema write still succeeds and the app is unaffected — the dual-write is structurally safer than a pure switch.
Transaction discipline: in relational databases, put both writes in the same transaction so neither-or-both is the guaranteed outcome. Cross-store dual-write (e.g. relational + search index) is beyond the scope of this pattern and needs a CDC-shaped approach (concepts/change-data-capture).
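The transaction discipline can be sketched with SQLite as a stand-in (the `record_star_count` helper is hypothetical; `sqlite3`'s connection context manager commits on success and rolls back on exception, giving the neither-or-both guarantee):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE star (repo_name TEXT PRIMARY KEY, star_count INTEGER)")
conn.execute("CREATE TABLE repo (repo_name TEXT PRIMARY KEY, star_count INTEGER)")
conn.execute("INSERT INTO repo (repo_name) VALUES ('planetscale/docs')")

def record_star_count(repo_name: str, count: int) -> None:
    """Dual-write: old and new schema updated in one transaction,
    so a failure in either write rolls back both."""
    with conn:  # commit on success, rollback on exception
        conn.execute(
            "INSERT OR REPLACE INTO star (repo_name, star_count) VALUES (?, ?)",
            (repo_name, count),
        )
        conn.execute(
            "UPDATE repo SET star_count = ? WHERE repo_name = ?",
            (count, repo_name),
        )

record_star_count("planetscale/docs", 121)
```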
Step 3 — Backfill¶
Copy historical data from the old schema to the new schema. Two shapes:
- One-shot script: for small data ("there's not much data, so it is safe not to use background jobs"). Run once, verify, move on.
- Background job: for large data. Spread the backfill over hours / days to avoid IOPS contention, replication lag, or lock contention on the active tables.
Include any data transformations in the backfill — if the semantic shape is changing (e.g. splitting a string column into two), the transformation lives in the backfill script, not in the live dual-write (the dual-write should already be emitting the new shape from Step 2).
Spot-check after the backfill completes. The pattern's correctness depends on backfilled rows + dual-written rows = complete new-schema state, so any gap is a silent data-correctness bug.
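A sketch of an idempotent backfill under the same assumed SQLite schema; the `WHERE star_count IS NULL` guard makes crash-and-restart safe and keeps the backfill from clobbering rows the live dual-write has already populated:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE star (repo_name TEXT PRIMARY KEY, star_count INTEGER)")
conn.execute("CREATE TABLE repo (repo_name TEXT PRIMARY KEY, star_count INTEGER)")
conn.executemany("INSERT INTO star VALUES (?, ?)",
                 [("a/a", 1), ("b/b", 2), ("c/c", 3)])
conn.executemany("INSERT INTO repo (repo_name) VALUES (?)",
                 [("a/a",), ("b/b",), ("c/c",)])
# One row was already dual-written; the backfill must not overwrite it.
conn.execute("UPDATE repo SET star_count = 99 WHERE repo_name = 'c/c'")

def backfill() -> None:
    # The NULL guard makes this idempotent: a restart re-copies only rows
    # the previous run didn't reach, and never touches dual-written rows.
    conn.execute("""
        UPDATE repo
        SET star_count = (SELECT star_count FROM star
                          WHERE star.repo_name = repo.repo_name)
        WHERE star_count IS NULL
    """)

backfill()
backfill()  # safe to re-run

# Spot-check: backfilled rows + dual-written rows = complete new-schema state.
gap = conn.execute("SELECT COUNT(*) FROM repo WHERE star_count IS NULL").fetchone()[0]
assert gap == 0
```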
Step 4 — Read from new¶
Update the application to read from the new schema. This is the most sensitive step — the first deploy where user-visible behaviour depends on the new schema. Production performance monitoring is load-bearing: the new column may index differently, sit on a different shard, have different selectivity, or have unexpected null distributions.
Tools: application performance monitoring, database query profiling (e.g. PlanetScale Insights, Datadog APM, New Relic, per-query EXPLAIN comparisons).
Rollback property: at this step, the application can still roll back to reading from the old schema because the old schema is still fully read-complete (thanks to Step 2's dual-writes). This is the last step at which a pure application-code rollback fully restores correctness.
Step 5 — Contract (stop writing to old)¶
Update the application to write only to the new schema, removing the dual-write. The old schema is now read-free and write-free but still exists as a safety margin.
Deploy and verify: watch for any late-arriving reader (a cron job, a separate service, a reporting pipeline) that's still reading the old schema and will now see stale data from this point forward. The existence of such a reader is itself a signal that the pattern wasn't applied at every consumer — go apply it there before proceeding.
Rollback window narrows: reverting the application restores writes to the old schema but the old schema is now stale on everything written since this deploy. The safe rollback window is bounded by how quickly you can detect a problem — typically minutes to hours.
Step 6 — Delete old¶
After a waiting period (the post suggests "a few days of no issues"), drop the old column / table. At this point the old schema is gone; rollback from this point requires restore-from-backup.
Optional deprecation-discovery step: before the final `DROP`, make the old column / table discoverable-broken:
- MySQL 8.0: mark the column invisible (invisible column). It's hidden from `SELECT *` but still referenceable by name, so code that reads it via `SELECT *` loses the column (and usually fails visibly downstream) while explicitly-named reads keep working. The invisible window surfaces `SELECT *` dependencies before the final drop.
- Table rename: rename the table to something like `star_DEPRECATED_2026_04`. Any code still referencing the old name errors loudly; after the agreed-upon window, drop the renamed table.
The waiting period is "longer than the longest feedback loop that touches this column" — for a column read by a daily cron job, a day is enough; for a column read by a quarterly batch report, a quarter is the right window.
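The table-rename variant can be sketched in SQLite (a minimal stand-in; the deprecated name follows the `star_DEPRECATED_2026_04` convention above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE star (repo_name TEXT PRIMARY KEY, star_count INTEGER)")
conn.execute("INSERT INTO star VALUES ('planetscale/docs', 121)")

# Deprecation-discovery: rename rather than drop. Data survives under the new
# name, but any late reader of the old name now fails loudly instead of
# silently reading stale data.
conn.execute("ALTER TABLE star RENAME TO star_DEPRECATED_2026_04")

try:
    conn.execute("SELECT * FROM star")
    late_reader_broken = False
except sqlite3.OperationalError:  # "no such table: star"
    late_reader_broken = True
assert late_reader_broken
```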
Why the pattern works¶
Each step has a working rollback target¶
The core property: at every step, rolling back the most recent deploy restores a fully-working system.
| After step | Old schema | New schema | Rollback target |
|---|---|---|---|
| 1 Expand | read+write complete | empty | redeploy app without new column |
| 2 Dual-write | read+write complete | write-complete from now | remove new-write from app |
| 3 Backfill | read+write complete | read-complete | drop backfilled data, repeat later |
| 4 Read from new | read-complete (dual-written), write-complete | read+write complete | revert app to reading old |
| 5 Stop old writes | read-complete (frozen) | read+write complete | restore dual-writes in app |
| 6 Delete old | gone | read+write complete | restore-from-backup only |
Each deploy is a safe-by-construction transition between
adjacent safe states. Contrast with a direct RENAME COLUMN
+ simultaneous app deploy: the system is not safe during
the brief window where the DDL has committed and some app
instances have the new code while others have the old.
Decoupling the deploy clocks¶
The pattern operationalises the principle that application-code and database-schema have independent deployment clocks that cannot synchronise. Canonicalised at concepts/coupled-vs-decoupled-database-schema-app-deploy. The six-step pattern is the concrete technique that makes decoupled deploys viable; without it, every schema change requires a maintenance window.
Forcing backward compatibility as a free property¶
Because the pattern requires each step to work with both the old and new code running simultaneously (during rollout), backward compatibility falls out as a construction property, not a design goal. The schema during each step is compatible with the previous step's code, and the code for each step is compatible with the previous step's schema.
Worked example: consolidating a star table into repo¶
(From the PlanetScale post: source tracks GitHub star counts
in a separate star table; wants to merge into repo.)
| Step | Action | SQL / code |
|---|---|---|
| 1 Expand | Add `repo.star_count` | `ALTER TABLE repo ADD COLUMN star_count INT;` |
| 2 Dual-write | App writes to both tables | `UPDATE repo SET star_count = ?; UPDATE star SET star_count = ?;` |
| 3 Backfill | One-shot script (small data) | `UPDATE repo SET star_count = (SELECT star_count FROM star WHERE ...);` |
| 4 Read new | App reads `repo.star_count`; verify via APM | (app code change only) |
| 5 Contract | App stops writing `star.star_count` | (single deploy) |
| 6 Delete | Drop old table after waiting period | `DROP TABLE star;` |
Variants at other substrates¶
The six-step shape generalises beyond relational DDL:
- DynamoDB re-keying: new `new_key` attribute + dual-write + backfill via `Scan` + read-from-new + stop-write + GC old keys.
- Elasticsearch index reshape: new index + index alias + dual-write (new + old) + reindex (backfill) + swap alias (read-from-new) + stop-writing-old + delete-old.
- Kafka topic reshape: new-topic + dual-publish + consumer-cutover + old-topic-drain + old-topic-delete.
- Protobuf schema change: add new field tag + dual-populate + migrate consumers + stop-populating-old + remove-old-field. (Note proto3 semantics already encourage this via explicit-optional and `oneof` over enum + field.)
- API endpoint migration: new endpoint + dual-emit (old + new clients hit matching endpoints) + migrate clients + retire old endpoint.
All variants share the invariant: "add new representation without breaking old → make both representations track the truth → flip authoritative source → retire old representation."
Caveats and failure modes¶
Mid-step failure recovery¶
The pattern assumes clean cutovers between steps. Failure modes that are not addressed by the pattern alone:
- Mid-rollout deploy split (Step 4): if the new-read deploy only reaches half the fleet before a failure, reads are split across old + new for the unrolled-out duration. Mitigation: feature-flag the read-from-new behaviour so the split can be quickly reversed without waiting for the deploy pipeline.
- Mid-backfill crash (Step 3): the backfill script must be idempotent (use `WHERE new_col IS NULL` or similar) so restart is safe.
- Dual-write race conditions (Step 2): if the old-write and new-write span different transactions, a concurrent update can interleave and leave the two schemas disagreeing. Mitigation: same transaction for both writes.
Cost¶
- Six deploys and N weeks instead of one deploy and minutes for a direct `RENAME`. The cost scales with how many application-code changes the pattern requires: for a simple column rename touching 10 call-sites in one service, a week is typical; for a schema reshape touching dozens of services, months.
- 2× storage and write load during Steps 2–5 while both schemas are populated.
- Discipline overhead: every consumer of the schema (including cron jobs, batch reports, external services) must be identified and migrated. Missing one produces a silent data-drift bug after Step 5.
Where the pattern is insufficient¶
- Multi-service / cross-boundary schema — the pattern is single-database / single-application. If a column is read via API or event stream by another service, the pattern must be applied at each service boundary. See concepts/schema-evolution for the CDC sibling of this discipline.
- Irreversible data transformations — e.g. hashing a column before storage. The pattern assumes the transformation is one-way reproducible in Step 3; if it isn't, pick a transformation that is, or accept that Step 6 is truly irreversible.
- Schema with physical-layout implications — changing a column's type from `INT` to `BIGINT` via expand-migrate-contract is fine; changing it from `VARCHAR` to `TEXT` on a table with millions of rows may not be expressible without a full table rewrite, which the pattern doesn't accelerate (but still de-risks by making the app-code side independent of the DDL).
Seen in¶
- sources/2026-04-21-planetscale-backward-compatible-database-changes
— canonical definitional reference on the wiki.
Taylor Barnett (PlanetScale) walks the full six-step
sequence with MySQL-specific safety primitives (MySQL 8.0
invisible-column as the Step-6 deprecation-discovery
tool, PlanetScale branching + Insights as the
convenience substrate) and a worked example merging a
`star` table into a `repo` table. The post's thesis sentence: "This ensures that at any step of the process, you can rollback without data loss or significant disruptions to users. This greatly reduces the risk and allows you to move faster and with confidence."
Related¶
- concepts/coupled-vs-decoupled-database-schema-app-deploy — the first-principles framing the pattern operates on.
- concepts/backward-compatibility — the general property; this pattern is its schema-DML operationalisation.
- concepts/schema-evolution — sibling discipline for async-CDC pipelines where the pattern is the producer side of the evolution.
- concepts/mysql-invisible-column — MySQL 8.0 primitive making Step 6 deprecation-discoverable.
- patterns/dual-write-migration — Step 2's generalisation across different substrates (protocol migration, observability platform migration).
- patterns/shadow-migration — sibling risk-reduction migration pattern; shadow-migration runs both systems simultaneously and reconciles, while expand-migrate-contract stages the swap through a shared database.
- systems/planetscale-insights — Step-4 production-performance-monitoring aid.
- systems/mysql — canonical substrate; invisible-column is MySQL-specific.