Skip to content

CONCEPT Cited by 2 sources

Consistent non-locking snapshot

Definition

A consistent non-locking snapshot is a point-in-time view of a database (or a table) from which large amounts of data can be read without blocking concurrent writes and without the read results changing for the duration of the snapshot's use. The "consistent" half guarantees that every row read reflects the same logical point in time; the "non-locking" half guarantees that opening the snapshot does not make the source database read-only or write-only for the duration.

The canonical MySQL-specific recipe (used by Vitess VReplication and Vitess VDiff):

LOCK TABLES <tbl> READ;              -- briefly, milliseconds
START TRANSACTION WITH CONSISTENT SNAPSHOT;
SELECT @@global.GTID_EXECUTED;       -- record position metadata
UNLOCK TABLES;                       -- release immediately
-- Subsequent reads in this transaction see a consistent view
-- of the table as of the moment the transaction opened.

The brief LOCK TABLES ... READ is not there to serialise reads against writes — it's there to ensure that the snapshot opened by START TRANSACTION WITH CONSISTENT SNAPSHOT and the GTID_EXECUTED value read immediately after correspond to exactly the same logical point in time on the server. Without the table lock, a concurrent committed write could slip in between the snapshot's materialisation and the read of GTID_EXECUTED, leaving the follow-on binlog replication starting from a position that isn't consistent with the snapshot's row data. Once both values are safely captured, the lock is released and the database remains fully available.

Why this matters for data motion

A consistent non-locking snapshot is the foundation of every zero-downtime database migration. If the snapshot were locking, the source database would be unavailable for writes for however long the full-table copy takes — at petabyte scale, hours to weeks. If the snapshot were non-consistent, the copy would include rows from multiple logical times and any follow-on replication catch-up wouldn't know where to start from — corrupt destination state is the outcome.

Other databases offer analogous primitives:

  • PostgreSQLSET TRANSACTION ISOLATION LEVEL REPEATABLE READ or SERIALIZABLE + BEGIN + pg_export_snapshot() exports a snapshot token that concurrent transactions can import via SET TRANSACTION SNAPSHOT. Combined with the LSN from pg_current_wal_lsn() or a replication-slot LSN.
  • Oracle — READ-ONLY or flashback queries with SCN (system change number).
  • MongoDB — causally-consistent sessions plus the oplog timestamp.

The shape is universal: snapshot + replication-position metadata captured atomically is the required substrate for any "copy-now, catch-up-via-replication-later" migration.

Seen in

  • sources/2026-04-21-planetscale-behind-the-scenes-how-schema-reverts-work — the online-DDL application of the same primitive. Guevara + Noach: "When we begin copying a set of rows, we run START TRANSACTION WITH CONSISTENT SNAPSHOT, which takes that snapshot and essentially freezes time while we copy the rows over. This is done using the GTID, Global Transaction ID, which captures the existing state down to the transaction level." Same snapshot recipe as the data-motion case, smaller scope: applied per-batch during the backfill half of a shadow- table online schema change, so the shadow can be populated without blocking writes on the live table. The GTID captured alongside each snapshot batch is the index VReplication uses to interleave copy and change-log phases until the shadow catches up.

  • sources/2026-02-16-planetscale-zero-downtime-migrations-at-petabyte-scale — canonical wiki description of the MySQL-specific recipe: LOCK TABLES <tbl> READSTART TRANSACTION WITH CONSISTENT SNAPSHOT → read @@global.GTID_EXECUTED → release lock. Used on every table of every VReplication workflow Vitess runs, plus on the source and each target shard during VDiff. "At this point we have a consistent snapshot of the table data and the GTID set or 'position' metadata to go along with it so that we can replicate changes to the table that have occurred since our snapshot was taken." The brief READ lock is called out explicitly as the only point in the entire migration when the source database has any read/write coordination cost at all — the rest of the process is genuinely non-impacting.

Last updated · 319 distilled / 1,201 read