
PATTERN Cited by 1 source

Dual-write migration

Dual-write migration is a pattern for protocol/backend transitions where producers emit to both the legacy and the new system simultaneously for the overlap window. The new system is validated against the old one in production with real traffic; cutover and decommission happen only after it's proven out.

Why it works for protocol migrations

  • No big-bang switch. The old pipeline stays authoritative end to end until dashboards and alerts are validated against the new one.
  • Low per-team friction when the dual-write is done in a shared platform library: one library change, many services migrated.
  • Immediate visibility into scale issues. Emission volume is full production from day one, so bottlenecks surface before you're depending on the new pipeline.

Airbnb made a shared metrics library dual-emit StatsD + OTLP to migrate ~40% of services (those on the shared lib) with a single config change; only the stragglers needed per-service work. (Source: sources/2026-04-16-airbnb-statsd-to-otel-metrics-pipeline)

Costs

  • 2× emission load during overlap. The doubled load can itself become the thing that breaks: Airbnb's highest-cardinality emitters regressed under the combined OTLP+StatsD load and had to move to delta temporality (concepts/metric-temporality).
  • 2× storage / ingest cost downstream for the migration window.
  • Consistency work if the two systems disagree (different aggregation semantics, different retention) — differences need to be explained, not hidden.

Where to implement the fork

Best place: a shared instrumentation library used by many services. One change, wide coverage, easy rollback. Second-best: a collector sidecar that can translate between protocols (but a translation layer adds CPU and a format-mismatch failure mode — Airbnb specifically called out removing StatsD→OTLP translation from the OTel Collector as a win of native OTLP emission).
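A minimal sketch of the library-level fork, assuming two hypothetical sink callables that stand in for a real StatsD client and OTLP exporter. The `DualEmitter` name and its flags are illustrative and are not taken from Airbnb's library. The legacy path keeps its existing behaviour, while failures on the unproven path are counted rather than raised, so the overlap window cannot break callers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical sink signature: (metric name, value, tags).
# Real sinks would wrap a StatsD client and an OTLP exporter.
Sink = Callable[[str, float, Dict[str, str]], None]


@dataclass
class DualEmitter:
    legacy_sink: Sink
    new_sink: Sink
    emit_legacy: bool = True   # turn off only after the exit criteria are met
    emit_new: bool = False     # turn on to start the overlap window
    new_sink_errors: int = 0

    def count(self, name: str, value: float = 1.0,
              tags: Optional[Dict[str, str]] = None) -> None:
        tags = tags or {}
        if self.emit_legacy:
            # Legacy path is still authoritative: errors surface as before.
            self.legacy_sink(name, value, tags)
        if self.emit_new:
            try:
                self.new_sink(name, value, tags)
            except Exception:
                # The new path must never take the caller down with it.
                self.new_sink_errors += 1


legacy_seen, new_seen = [], []
emitter = DualEmitter(
    legacy_sink=lambda n, v, t: legacy_seen.append((n, v)),
    new_sink=lambda n, v, t: new_seen.append((n, v)),
)
emitter.count("requests")        # legacy only
emitter.emit_new = True          # the single config change
emitter.count("requests", 2.0)   # both paths
```

Rollback in this sketch is the same flag flipped back: setting `emit_new` to `False` returns every service on the library to legacy-only emission.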

Exit criteria (before removing the legacy path)

  • All dashboards and alerts ported and validated against the new system.
  • Query-result equivalence checked on a sample of high-priority metrics.
  • Consumer tooling (LLMs, runbooks) updated to the new data model.
  • A documented rollback plan, in case the new pipeline surprises you after cutover.
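The query-result equivalence check can be sketched as a point-by-point comparison of the same metric queried from both systems. This is an illustration only: the `compare_series` helper, the `{timestamp: value}` input shape, and the 1% relative tolerance are assumptions, and a real check would need to account for the aggregation and retention differences noted under Costs.

```python
import math
from typing import Dict, List, Optional, Tuple

Mismatch = Tuple[int, Optional[float], Optional[float]]


def compare_series(legacy: Dict[int, float], new: Dict[int, float],
                   rel_tol: float = 0.01) -> List[Mismatch]:
    """Return (timestamp, legacy_value, new_value) for every point that disagrees.

    A point missing from either side counts as a disagreement.
    """
    mismatches: List[Mismatch] = []
    for ts in sorted(set(legacy) | set(new)):
        a, b = legacy.get(ts), new.get(ts)
        if a is None or b is None or not math.isclose(a, b, rel_tol=rel_tol):
            mismatches.append((ts, a, b))
    return mismatches


# Hypothetical query results for one metric, keyed by epoch seconds.
legacy_result = {0: 100.0, 60: 200.0, 120: 300.0}
new_result = {0: 100.5, 60: 230.0}

diffs = compare_series(legacy_result, new_result)
# The point at 0 is within tolerance; 60 differs by 15%; 120 is missing in the new system.
```

Each entry in `diffs` is something to explain before the legacy path is removed, not something to suppress.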

Seen in

  • sources/2026-04-16-airbnb-statsd-to-otel-metrics-pipeline — shared metrics library dual-emits StatsD + OTLP; enabled broad OTLP rollout with low friction.

  • Dual-write as Step 2 of the canonical expand-migrate-contract pattern for relational-schema migrations. Taylor Barnett's walkthrough specialises the dual-write pattern to a single database: the application writes every mutation to both the old and the new schema (column, table, or restructured representation) for the overlap window, and reads continue from the old schema until Step 4's cutover. The substrate differs from Airbnb's cross-system protocol migration (StatsD → OTLP, two different monitoring platforms), but the invariant is structurally identical: "add the new target, dual-emit, migrate reads, decommission the old target." PlanetScale's framing extends the pattern with (a) an explicit backfill step (Step 3, between dual-write and read-from-new) to close the gap for historical data that preceded dual-writes, (b) deliberate transaction discipline (both writes in one transaction, so either both apply or neither does), and (c) MySQL-specific invisible-column deprecation at Step 6 for discoverable column removal. The walkthrough positions dual-write as the foundational Step 2 primitive underneath the six-step schema-migration dance, reusable from single-database migrations up to cross-system protocol migrations.

  • Justin Gage (guest post, 2023-04-06) surfaces Notion's canonical four-step unsharded-to-sharded migration: double-write → backfill → verification → switch-over. Canonicalised as notion-double-write-backfill-verify-switchover, a named specialisation of dual-write-migration for sharding cutovers at the database-cluster altitude. The explicit four-step structure adds verification as an operational gate between backfill and switch-over (not present in Airbnb's OTLP cutover, which trusted dual-read comparison on the fly), and adds incremental switch-over via double-reads (the read-side analog of the write-side double-write) as a named sub-phase. It also adds the honest caveat that "Each of these steps still introduces the possibility of downtime; it's just a risk you're going to have to take for changes at this scale." The post positions the four-step sequence as the canonical reference for hand-rolled unsharded-to-sharded migrations; on Vitess, the primitive composes with Reshard-online-via-VReplication, which provides backfill+tail + VDiff + SwitchTraffic as the substrate-layer implementation of the same four phases.
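The two database entries above share a write-both, backfill, verify sequence. The sketch below walks through it on an in-memory SQLite database using Python's standard `sqlite3` module. The `users` table, the `name` and `display_name` columns, and the helper functions are all hypothetical, chosen only to make the steps concrete; the cited walkthroughs target MySQL and sharded Postgres, not SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")  # predates dual-write
conn.commit()

# Expand: add the new representation next to the old one.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


def add_user(conn: sqlite3.Connection, user_id: int, name: str) -> None:
    """Dual-write: both writes share one transaction, so both apply or neither does."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name))
        conn.execute("UPDATE users SET display_name = ? WHERE id = ?", (name, user_id))


def backfill(conn: sqlite3.Connection) -> int:
    """Copy old values into the new column for rows written before dual-write began."""
    with conn:
        cur = conn.execute(
            "UPDATE users SET display_name = name WHERE display_name IS NULL"
        )
    return cur.rowcount


def verify(conn: sqlite3.Connection) -> list:
    """Return ids whose old and new values disagree (IS NOT is NULL-safe in SQLite)."""
    rows = conn.execute(
        "SELECT id FROM users WHERE display_name IS NOT name ORDER BY id"
    ).fetchall()
    return [row[0] for row in rows]


add_user(conn, 2, "Grace")     # lands in both columns
stale_before = verify(conn)    # row 1 has no display_name yet
backfilled = backfill(conn)    # closes the gap for historical rows
stale_after = verify(conn)     # empty list is the gate for switching reads
```

Reads stay on the old column throughout; only an empty `verify` result would justify moving them to the new one.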

Last updated · 542 distilled / 1,571 read