
PATTERN

Read replica as migration source

Problem

A zero-downtime migration reads huge amounts of data from the source database over hours, days, or weeks. Even when the read pattern is maximally polite (primary-key-ordered sequential scans, short transactions, no table locks), the aggregate load is non-trivial:

  • The initial copy phase scans every row of every table — potentially terabytes.
  • The continuous replication phase runs binlog-dump connections indefinitely.
  • The pre-cutover VDiff full-table-scans both sides again.
  • All of this is in addition to the brief LOCK TABLES ... READ at snapshot setup time.

Running this load against the production primary adds pressure exactly where you don't want it: on the system currently serving live customer traffic. Even when the load doesn't cause a visible outage, it eats into headroom, buffer pool, and I/O capacity — all while the customer is already nervous about a migration in flight.
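The "maximally polite" read pattern described above amounts to keyset pagination: short range queries that resume from the last-seen primary key, rather than one long-running scan. A minimal sketch of that shape (using sqlite3 here purely so the example is self-contained; a real migration would issue the same form of query against MySQL, and the function name is illustrative):

```python
import sqlite3

def copy_in_chunks(conn, table, pk, chunk_size=1000):
    """Primary-key-ordered sequential scan: short transactions, no table locks.

    Yields rows in chunks, resuming from the last-seen primary key, so each
    query is a cheap range scan rather than one long-held cursor.
    """
    last_pk = None
    while True:
        if last_pk is None:
            rows = conn.execute(
                f"SELECT * FROM {table} ORDER BY {pk} LIMIT ?",
                (chunk_size,),
            ).fetchall()
        else:
            rows = conn.execute(
                f"SELECT * FROM {table} WHERE {pk} > ? ORDER BY {pk} LIMIT ?",
                (last_pk, chunk_size),
            ).fetchall()
        if not rows:
            return
        yield rows
        last_pk = rows[-1][0]  # assumes the primary key is the first column
```

Each individual query is cheap, but multiplied over every row of every table this is still a large aggregate read load — which is the point of the pattern: that load has to land somewhere.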

Solution

Run the migration's source reads against a dedicated read replica of the production primary, not against the primary itself. MySQL binlog replication is built for this — production primaries typically already have one or more replicas for read-scaling, failover, and backups. Point the migration at one of them (or provision one specifically for the migration).

Properties bought:

  • Zero load on the production primary — every migration-related read lands on the replica.
  • The brief LOCK TABLES ... READ at snapshot setup happens on the replica, not on production. Even the millisecond-scale lock is kept off the serving path.
  • Failure blast-radius is bounded to the replica. If the migration inadvertently overloads its source, only the replica suffers; the primary is unaffected.
  • Replicas can be provisioned specifically for the migration and destroyed after — sized and configured for bulk-read workload, different from production optimisation.
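The snapshot-lock property above can be made concrete. The SQL statements below are standard MySQL; the surrounding function and its cursor interface are an illustrative sketch, not a specific migration tool's API:

```python
def snapshot_on_replica(cursor):
    """Take the migration's consistent snapshot on the REPLICA, not the primary.

    The brief table-level read lock serialises reading the GTID position with
    opening the snapshot, so the recorded position exactly matches the
    snapshot's contents. The lock lands on the replica only; the primary's
    serving path is never touched.
    """
    cursor.execute("FLUSH TABLES WITH READ LOCK")
    cursor.execute("SELECT @@global.gtid_executed")
    gtid_set = cursor.fetchone()[0]  # migration start position
    cursor.execute("START TRANSACTION WITH CONSISTENT SNAPSHOT")
    cursor.execute("UNLOCK TABLES")  # lock held only briefly
    return gtid_set
```

The copy phase then reads from the consistent snapshot, and the binlog stream resumes from the recorded GTID set.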

Requirements

  • The replica must have binlogs enabled (log_slave_updates = ON, or its MySQL 8.0.26+ alias log_replica_updates) so it can relay binlog events to the migration — the migration consumer tails the replica's binlogs, not the primary's.
  • The replica must be current enough that starting the migration from its GTID position is acceptable. (Any replication-lag-at-start is irrelevant once the migration runs — the migration's own streams catch up via binlog as part of normal operation.)
  • The replica must be stable — a flapping replica produces a flapping migration.
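The requirements above are mechanically checkable before the migration starts. A sketch of such a preflight check — the input dicts mirror the shapes of SHOW GLOBAL VARIABLES and SHOW REPLICA STATUS output, but the function name and lag threshold are illustrative assumptions:

```python
def check_replica_prereqs(variables, status, max_lag_seconds=300):
    """Validate a candidate replica against the migration's requirements.

    `variables` is a dict of global variables (as from SHOW GLOBAL VARIABLES)
    and `status` a dict of replication status fields (as from SHOW REPLICA
    STATUS). Returns a list of problems; empty means the replica qualifies.
    """
    problems = []
    if variables.get("log_bin") != "ON":
        problems.append("binary logging is disabled (log_bin=OFF)")
    if variables.get("log_slave_updates") != "ON":
        problems.append("replica does not write replicated events to its own "
                        "binlog (log_slave_updates=OFF)")
    if variables.get("gtid_mode") != "ON":
        problems.append("GTID mode is off; cannot record a start position")
    lag = status.get("Seconds_Behind_Source")
    if lag is None:
        problems.append("replication is not running (lag unknown)")
    elif lag > max_lag_seconds:
        problems.append(f"replica is {lag}s behind; too stale to start from")
    return problems
```

The stability requirement is harder to check point-in-time; in practice it means watching the lag and replication-thread status over a window, not a single sample.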

Applicability

Apply aggressively: this is the default shape at any non-trivial scale. The operational cost of provisioning a replica is tiny compared to the risk cost of running the migration against production.

The pattern also generalises to non-MySQL substrates where a replica / read-standby exists:

  • PostgreSQL — stream from a hot standby / read-replica.
  • MongoDB — read from a secondary.
  • Any system with a reader/writer topology — prefer the reader endpoint for bulk reads.
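The generalised rule — prefer the reader endpoint for bulk reads, and treat falling back to the writer as an exceptional, flagged condition — can be stated as a few lines. The topology dict shape here is an illustrative assumption:

```python
def bulk_read_endpoint(topology):
    """Pick the source endpoint for migration bulk reads.

    Prefers a reader endpoint; falls back to the writer only if no reader
    exists, returning a flag so the caller knows migration load will land
    on the serving path.
    """
    readers = topology.get("readers", [])
    if readers:
        return readers[0], True   # (endpoint, off_primary)
    return topology["writer"], False
```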

Seen in

  • sources/2026-02-16-planetscale-zero-downtime-migrations-at-petabyte-scale — canonical wiki instance. Matt Lord's explicit recommendation: "As you can imagine, executing all of this work on your current live database instance can be somewhat heavy or expensive and potentially interfere with your live application traffic and its overall performance (and there is that brief window in step a where we take a read-only table level lock to get a GTID set/position to go along with the consistent snapshot of the table). It's for this reason that we recommend you setup a standard MySQL replica — if you don't already have one — and use that as the source MySQL instance for the migration. This is another key factor that ensures we not only avoid downtime, but we avoid any impact whatsoever on the live production system that is currently serving your application data."