PATTERN
Routing rule swap cutover¶
Problem¶
After a long, careful, verified zero-downtime migration, the final step is still risky in most architectures: switching the application's queries from the old database to the new one. The naive approaches all have visible failure modes:
- Change the application's connection string. Requires a coordinated rollout across every application pod, is not atomic across tables, produces a reconnection storm, and is slow to revert.
- DNS flip. Atomicity is best-effort (DNS caches outlive the flip), and revert is only as fast as DNS-TTL expiry.
- Stop the old DB, start the new DB at the same address. Briefly unavailable, and not cleanly reversible.
The underlying issue: most architectures have no query-level routing primitive that can be flipped atomically, at sub-second granularity, without the application participating.
Solution¶
Put a query-aware proxy layer between the application and the database that terminates the database wire protocol, applies per-table routing rules to decide which backend to send each query to, and can be told to update those rules atomically. The cutover is then:
- Stop writes on the source keyspace.
- Buffer incoming queries at the proxy (concepts/query-buffering-cutover).
- Wait for replication to fully catch up.
- Atomically update the routing rules so queries for the migrated tables go to the new keyspace instead of the source keyspace.
- Release buffered queries against the new backend.
- (Optionally) start a reverse workflow so rollback is possible without data loss (patterns/reverse-replication-for-rollback).
The application sees a brief latency spike on the in-flight queries. It does not see errors, dropped connections, or reconnection storms. It doesn't even know the cutover happened.
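The cutover sequence above can be sketched as a toy proxy-side routine. All names here are hypothetical (Vitess implements the real thing inside VTGate and vtctld); the point is that the rule table is read per query and swapped as a single reference assignment while a gate buffers traffic:

```python
import threading

class RoutingProxy:
    """Toy query-aware proxy: routes each query by table, reads an
    atomically swappable rule table per query, and gates queries
    behind a buffer during cutover. Illustrative sketch only."""

    def __init__(self, rules):
        self._rules = dict(rules)       # table -> backend keyspace
        self._gate = threading.Event()  # set = queries flow freely
        self._gate.set()

    def route(self, table, query):
        self._gate.wait()               # buffers (blocks) while a cutover runs
        backend = self._rules[table]    # looked up per query, not per connection
        return f"{backend}: {query}"

    def cutover(self, new_rules, wait_for_catchup):
        self._gate.clear()              # stop and buffer incoming queries
        wait_for_catchup()              # let replication fully catch up
        self._rules = dict(new_rules)   # atomic swap: one reference assignment
        self._gate.set()                # release buffered queries to the new backend
```

A query issued mid-cutover simply blocks in `route()` until the gate reopens, then lands on the new keyspace: the caller observes added latency, never an error or a dropped connection.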
Required substrate¶
- Query-aware proxy — MySQL-wire-compatible (VTGate), Postgres-wire-compatible (Pgpool, PgBouncer, PlanetScale proxy), or protocol-agnostic (application gateway). Must be able to speak the database wire protocol end-to-end, not just TCP-forward it.
- Per-table (or per-keyspace, or per-workflow) routing rules that can be updated atomically and that the proxy respects per-query, not per-connection.
- Query buffering — the proxy must be able to pause inbound queries for the brief duration of the flip and release them afterwards.
- Topology-server-level transactionality — the routing rule update itself must be atomic across proxy nodes. In Vitess this is implemented via a distributed topology server with workflow locks.
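The last substrate item, topology-server-level transactionality, amounts to a versioned rules document plus named locks, so that concurrent operators cannot publish conflicting rule sets. A minimal sketch, assuming a compare-and-swap store (the class and method names are invented; Vitess uses an etcd/ZooKeeper-style topology server with workflow locks):

```python
import threading

class TopoServer:
    """Toy topology server: a versioned routing-rules document plus
    named workflow locks. Hypothetical sketch of the transactionality
    the pattern requires, not Vitess's actual API."""

    def __init__(self, rules):
        self._mu = threading.Lock()
        self._locks_held = set()
        self._version = 1
        self._rules = rules

    def lock_workflow(self, name):
        with self._mu:
            if name in self._locks_held:
                raise RuntimeError(f"workflow {name!r} already locked")
            self._locks_held.add(name)

    def unlock_workflow(self, name):
        with self._mu:
            self._locks_held.discard(name)

    def get_rules(self):
        with self._mu:
            return self._version, self._rules

    def apply_rules(self, expected_version, new_rules):
        # Compare-and-swap: the publish fails if anyone else published
        # in between, so proxies never observe a half-applied rule set.
        with self._mu:
            if expected_version != self._version:
                raise RuntimeError("concurrent routing-rules update")
            self._rules = new_rules
            self._version += 1
            return self._version
```

Proxies that re-read the document always see either the old version or the new one, never a mixture, which is what makes the flip atomic across proxy nodes.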
Canonical wiki instance¶
Vitess MoveTables SwitchTraffic — the canonical wiki implementation. Full sequence (Source: sources/2026-02-16-planetscale-zero-downtime-migrations-at-petabyte-scale):
- Pre-checks on tablet health + replication lag + workflow state.
- Lock source + target keyspaces in topology server.
- Lock workflow (named lock).
- Stop writes on source keyspace.
- Begin buffering incoming queries at VTGate.
- Wait for forward replication to fully catch up.
- Create reverse VReplication workflow for rollback.
- Initialise Vitess Sequences if tables are being sharded.
- Allow writes to target keyspace.
- Atomically update schema routing rules pointing migrated tables at the target keyspace.
- Release buffered queries to target.
- Start reverse VReplication workflow.
- Freeze original (forward) workflow.
- Release locks.
Typical cutover duration: "less than 1 second." The application sees a brief latency spike; no errors.
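The rule update at the heart of the sequence is just a small document published to the topology server. Below is a sketch of what the post-cutover rules could look like, with the shape modeled on Vitess's routing-rules JSON (`rules`, `from_table`, `to_tables`); treat the exact field names as illustrative rather than authoritative:

```python
def switch_rules(tables, source_ks, target_ks):
    """Build a routing-rules document that points both the unqualified
    and source-qualified names of each migrated table at the target
    keyspace. Shape modeled on Vitess's routing rules; illustrative."""
    rules = []
    for t in tables:
        for from_table in (t, f"{source_ks}.{t}"):
            rules.append({"from_table": from_table,
                          "to_tables": [f"{target_ks}.{t}"]})
    return {"rules": rules}

# After SwitchTraffic, a query against commerce.customer resolves to
# customer.customer without the application changing anything.
doc = switch_rules(["customer", "corder"], "commerce", "customer")
```

Because both the bare and the source-qualified table names are rewritten, existing application queries keep working verbatim against the new keyspace.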
Composes with¶
- patterns/snapshot-plus-catchup-replication — the data-motion half that the routing-rule swap is the cutover for.
- patterns/vdiff-verify-before-cutover — only flip routing rules after VDiff shows clean.
- patterns/reverse-replication-for-rollback — the reverse workflow created at cutover enables a subsequent ReverseTraffic (re-swap) without data loss.
- concepts/query-buffering-cutover — the load-bearing primitive that makes the brief write-pause invisible to the application.
Seen in¶
- sources/2026-04-21-planetscale-bring-your-data-to-planetscale — canonical wiki instance of the sister pattern at PlanetScale. Phani Raju's 2021 launch post describes an earlier variant of PlanetScale Imports in which cutover is not an atomic sub-second routing-rule swap but an explicit direction reversal ("Enable primary mode") preceded by an extended bidirectional-routing validation phase (see patterns/database-as-data-router). Both patterns are built on the same primitives (routing rules, VReplication, reverse replication, unmanaged tablets) but differ in the cutover ceremony: this page's pattern makes cutover an instantaneous swap, validated pre-cutover by VDiff / replicas; the sister pattern makes cutover an operator-driven direction flip, validated pre-cutover by real application traffic running against the destination-as-proxy. The 2021 post canonicalises the bidirectional-validation alternative; the 2026 post canonicalises the atomic-swap default.
- sources/2026-02-16-planetscale-zero-downtime-migrations-at-petabyte-scale — canonical wiki instance. Matt Lord documents the exact MoveTables SwitchTraffic sequence Vitess runs, names VTGate's routing-rule update as the explicit moment of cutover, and frames the whole sequence as sub-second. The sequence is presented as the standard approach for all PlanetScale customer migrations. Key architectural property: the application's connection string never changes — only the routing rule at the proxy layer does.
Related¶
- concepts/schema-routing-rules
- concepts/query-buffering-cutover
- concepts/reverse-replication-workflow
- concepts/online-database-import
- systems/vitess
- systems/vitess-movetables
- systems/planetscale
- patterns/snapshot-plus-catchup-replication
- patterns/vdiff-verify-before-cutover
- patterns/reverse-replication-for-rollback
- patterns/database-as-data-router
- companies/planetscale