CONCEPT · Cited by 3 sources

Online database import

Definition

Online database import is the capability to move an existing production database from one vendor or host to another without downtime — the source database stays write-available throughout, the destination catches up in parallel, and cutover is a protocol-level swap with at most single-digit-seconds of serialised-write blackout.

Canonical mechanisms:

  • Logical replication — vendor-to-vendor replay of decoded WAL / binlog changes; the destination applies them as regular SQL. Postgres has native logical replication (CREATE PUBLICATION/SUBSCRIPTION, the pgoutput plugin); MySQL has row-based binlog replication; external CDC tooling covers many vendor combinations (Debezium, AWS DMS, Google DMS).
  • Snapshot + catch-up stream — take a consistent snapshot via pg_dump / mysqldump / vendor snapshot, load into destination, then stream WAL / binlog from the snapshot LSN / GTID position forward.
  • Hybrid dump + CDC — initial dump via parallel pg_restore / Percona xtrabackup, then logical or physical replication tails from the dump's consistent point.
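The snapshot-plus-catch-up shape can be sketched as a toy model — all class and function names here are illustrative, not any vendor's API: the source keeps serving writes while the destination bulk-loads a consistent snapshot, then replays every change logged after the snapshot position.

```python
# Toy model of "snapshot + catch-up stream". A consistent snapshot is a row
# image plus the log position (LSN / GTID) it was taken at; catch-up replays
# only changes logged strictly after that position.

class Source:
    def __init__(self):
        self.rows = {}   # current table state
        self.wal = []    # append-only change log: (lsn, key, value)
        self.lsn = 0

    def write(self, key, value):
        self.lsn += 1
        self.rows[key] = value
        self.wal.append((self.lsn, key, value))

    def snapshot(self):
        # Row image + the position it was taken at.
        return dict(self.rows), self.lsn

def catch_up(dest_rows, from_lsn, source):
    """Replay changes logged strictly after the snapshot position."""
    applied = from_lsn
    for lsn, key, value in source.wal:
        if lsn > applied:
            dest_rows[key] = value
            applied = lsn
    return applied

source = Source()
source.write("a", 1)
snap, snap_lsn = source.snapshot()   # bulk-load this into the destination
source.write("a", 2)                 # source stays write-available meanwhile
source.write("b", 3)
dest = dict(snap)
final_lsn = catch_up(dest, snap_lsn, source)   # dest now matches source
```

The same two-phase structure underlies all three mechanisms; they differ only in how the snapshot is produced and how the change stream is decoded.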

The hard parts rarely appear in vendor announcements:

  • Schema-compatibility across versions — source on v13, destination on v17: catalog differences, data-type evolution, default-value behaviour, sequence ownership.
  • Extension handling — pgvector, postgis, pg_partman, citus: does the destination carry equivalent versions?
  • Large-object / sequence / custom-type edge cases.
  • Foreign-key-constraint enforcement during catch-up — destination may not have the complete graph yet.
  • Cross-region latency during the catch-up phase — replication lag vs source-write rate must trend to zero before cutover.
  • Cutover sequencing — fence source writes → flush replication lag to zero → redirect clients → unfence destination writes. Bad sequencing = split-brain writes.
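The cutover sequence in the last bullet can be sketched directly — under the assumption that fencing, lag measurement, and traffic redirection are exposed as simple operations (all names here are illustrative, not any vendor's API). The ordering is the whole point: fencing the source before redirecting clients is what prevents split-brain writes.

```python
# Sketch of the cutover sequence: fence source writes -> flush replication
# lag to zero -> redirect clients -> unfence destination writes.
import time

class Node:
    """A database endpoint that can accept or refuse writes."""
    def __init__(self):
        self.writable = True
    def fence_writes(self):
        self.writable = False
    def unfence_writes(self):
        self.writable = True

class Router:
    """Client-facing traffic director."""
    def __init__(self, target):
        self.target = target
    def point_at(self, node):
        self.target = node

def cut_over(source, dest, router, lag, timeout_s=10.0):
    source.fence_writes()                 # 1. fence source writes
    deadline = time.monotonic() + timeout_s
    while lag() > 0:                      # 2. flush replication lag to zero
        if time.monotonic() > deadline:
            source.unfence_writes()       # abort cleanly rather than risk
            raise TimeoutError("replication lag did not converge")
        time.sleep(0.01)
    router.point_at(dest)                 # 3. redirect clients
    dest.unfence_writes()                 # 4. unfence destination writes

source, dest = Node(), Node()
router = Router(source)
lag_samples = iter([3, 1, 0])            # simulated lag draining to zero
cut_over(source, dest, router, lambda: next(lag_samples, 0))
```

Note the abort path: if lag never converges within the timeout, the only safe move is to unfence the source and retry later — redirecting clients while the destination is still behind loses writes.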

Seen in

  • sources/2021-04-21-planetscale-bring-your-data-to-planetscale — earliest wiki canonicalisation of online database import as a user-facing feature (Phani Raju, 2021), four and a half years before the 2026-02-16 Matt Lord deep-dive. Launches PlanetScale's external-MySQL import feature and canonicalises two primitives the later post assumes as background: the Vitess unmanaged tablet as the composition primitive that lets Vitess attach to an externally managed MySQL, and the database-as-data-router shape in which the destination acts as a transparent write-through proxy during validation, with cutover as an explicit direction reversal ("Enable primary mode") rather than an atomic sub-second flip. Canonical statement that dual writes are the inferior alternative: "This step of routing your application's reads and writes through PlanetScale allows you to safely validate that your application is fully compatible with PlanetScale without taking downtime and without fragile application-level migration mechanisms like dual writes." Also canonicalises the three-stage lifecycle: validate (destination proxies writes back) → primary mode (direction reversed, source kept in sync via reverse replication) → detach (source disconnected, credentials deleted).

  • sources/2025-07-01-planetscale-planetscale-for-postgres — PlanetScale announces online imports into PlanetScale for Postgres "from any version > Postgres v13" plus automatic Postgres version updates without downtime. First canonical wiki instance of online database import as a vendor-boundary migration capability — vendor A to vendor B — rather than the within-vendor version-upgrade / replica-failover case. PlanetScale does not disclose the underlying mechanism; the v13+ lower bound is likely a Postgres logical-replication feature dependency (publication-filtered logical replication, new row-identity options, etc.).

  • sources/2026-02-16-planetscale-zero-downtime-migrations-at-petabyte-scale — canonical wiki mechanism description (on the MySQL side) complementing the 2025-07-01 announcement (capability, no mechanism). Matt Lord walks the exact primitives Vitess uses under the MoveTables workflow that powers PlanetScale Database Imports: consistent non-locking snapshot (LOCK TABLES … READ → START TRANSACTION WITH CONSISTENT SNAPSHOT → capture @@global.GTID_EXECUTED → UNLOCK TABLES), primary-key-ordered row streams filtered per destination shard, interleaved copy-plus-binlog-replication catch-up cycles to stay inside the source's binlog retention horizon, post-copy continuous replication via COM_BINLOG_DUMP_GTID with persisted GTID progress for restart semantics, a VDiff zero-downtime consistency check before cutover, an atomic schema-routing-rule swap at cutover with VTGate query buffering making the switch invisible to the application, and a reverse VReplication workflow kept running until Complete so the cutover is reversible. Typical cutover is "less than 1 second." Canonical statement that fault-tolerance is a correctness requirement for workflows that run at petabyte scale over days-to-weeks: "the likelihood of encountering some type of error … becomes increasingly likely." Explicitly covers sharding-as-part-of-migration (unsharded external MySQL split into N shards in-flight) and recommends running the migration against a replica so the primary sees zero load. This post thereby gives the canonical MySQL-side mechanical content for the online-database-import concept — likely a close cognate of what the Postgres side does internally on top of logical-replication slots.
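The interleaved copy-plus-catch-up cycle described above can be reduced to a toy model — names illustrative, the real workflow being Vitess MoveTables: copy rows in primary-key order a chunk at a time, and between chunks apply binlog events that fall inside the already-copied key range. Events beyond the copy point are safe to skip, because a later copy chunk reads the current row image anyway; the binlog position is persisted so the workflow can restart.

```python
# Toy model of interleaved copy + binlog catch-up (MoveTables-style).

def move_tables(source_rows, binlog, chunk=2):
    """source_rows: final row state; binlog: ordered (key, value) changes."""
    dest = {}
    copied_upto = None   # highest primary key copied so far
    pos = 0              # persisted binlog position (the restart point)
    keys = sorted(source_rows)
    for i in range(0, len(keys), chunk):
        for k in keys[i:i + chunk]:          # PK-ordered copy chunk
            dest[k] = source_rows[k]
            copied_upto = k
        while pos < len(binlog):             # catch-up cycle after the chunk
            key, value = binlog[pos]
            if key <= copied_upto:           # only the already-copied range
                dest[key] = value
            pos += 1
    return dest, pos

rows = {1: "a", 2: "b", 3: "c2", 4: "d"}     # state after all binlog events
log = [(3, "c2"), (1, "a")]                  # changes that landed mid-copy
dest, pos = move_tables(rows, log)           # dest converges to rows
```

The key-range filter is what keeps the copy and the change stream from stepping on each other: a change to a not-yet-copied row is deliberately dropped, since the forthcoming copy chunk will read the post-change row image.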

Last updated · 319 distilled / 1,201 read