CONCEPT Cited by 1 source

Snapshot Isolation¶

Definition¶

Snapshot Isolation (SI) is a transaction-isolation model in which each transaction reads from a consistent snapshot of the database as of its start time: it sees all writes committed before that snapshot and none of those committed after, regardless of how many concurrent transactions are in flight. Snapshots are read-locking-free — readers never block writers and writers never block readers — which is the model's operational selling point. Writers still acquire row locks and must resolve conflicts at commit (first-committer-wins, in the classic definition).

Formally (Berenson et al. 1995, Adya et al. 2000), SI forbids:

Dirty reads — reading uncommitted values.
Non-repeatable reads — two reads of the same row in the same transaction returning different values.
Phantom reads — a range-scan changing between two runs in the same transaction.
Lost updates — two concurrent transactions both writing the same row without either seeing the other's write.

SI sits below Serializability in the hierarchy: it allows write-skew anomalies that a truly serializable execution would forbid. Postgres's REPEATABLE READ is SI; its SERIALIZABLE (Serializable Snapshot Isolation, SSI) adds write-skew detection on top.

The atomic-visibility requirement¶

Formal SI carries a global commit-order property as part of its definition: if two readers observe the effects of two concurrent transactions T1 and T2, they must agree on which committed first. "Atomic visibility" in the Cerone/Bernardi/Gotsman (CONCUR 2015) framework. This is the property the concepts/long-fork-anomaly violates.

In a single-node, single-snapshot-source system, global commit order is trivially observed. In a clustered system with read replicas — or on a single Postgres primary where visibility is decoupled from durability (see concepts/visibility-order-vs-commit-order) — atomic visibility becomes a non-trivial distributed-systems problem.

Postgres's implementation is not formally SI¶

Per AWS's 2025-05-03 response to Jepsen's Postgres Multi-AZ analysis (Source: sources/2025-05-03-aws-postgresql-transaction-visibility-read-replicas):

"This behavior represents a deviation from formal Snapshot Isolation guarantees."

Postgres takes a snapshot by scanning the in-memory ProcArray for the list of currently-pending xids. A committing transaction writes its WAL commit record first, then asynchronously removes itself from ProcArray. Concurrent non-conflicting transactions can flip between the two steps, producing a Long Fork. This affects all isolation levels — Read Committed, Repeatable Read, and Serializable all take snapshots via ProcArray.

Practical impact is limited because most applications either (a) serialize via row conflicts that SI catches at commit, or (b) implement app-level ordering via shared counters or timestamps. Enterprise-grade features — distributed queries, read-write splitting, point-in-time restore to LSN, snapshot-then-replay data sync — are where the gap bites.

Why it matters for distributed Postgres¶

systems/aurora-dsql and systems/aurora-limitless replace Postgres's ProcArray-based visibility with time-based MVCC, explicitly to avoid the Long Fork anomaly and offer true formal SI across the cluster. Sidesteps rather than fixes the upstream issue. (Source: sources/2025-05-03-aws-postgresql-transaction-visibility-read-replicas.)
Commit Sequence Numbers (CSN) — the upstream Postgres fix discussed on pgsql-hackers and presented at PGConf.EU 2024. Makes visibility order match commit order by stamping commits with a monotonic sequence number and snapshotting by CSN watermark instead of ProcArray.
The formal consistency-model framework underpinning all of this is Cerone, Bernardi, Gotsman, A framework for transactional consistency models with atomic visibility (CONCUR 2015).

Relation to other wiki concepts¶

concepts/long-fork-anomaly — the specific violation of SI Postgres exhibits in clusters.
concepts/visibility-order-vs-commit-order — the mechanism behind the violation.
concepts/commit-sequence-number — the proposed upstream fix.
concepts/strong-consistency — different property (read-after-write on a single key); orthogonal to SI's multi-key snapshot guarantee.
concepts/postgres-mvcc-hot-updates — the storage-layer mechanism SI rides on (row versions + visibility checks).

Seen in¶

— Brian Morrison II's PlanetScale pedagogical post canonicalises MySQL's REPEATABLE READ as the snapshot-isolation-flavoured level in the InnoDB implementation. "With InnoDB's MVCC (Multi Version Concurrency Control), a row may have multiple versions at any given time, depending on open transactions. In repeatable read level, queries will use a consistent snapshot of the data, pinning the reads to a single transaction ID throughout the transaction." The same foundational SI substrate as Postgres's REPEATABLE READ (which is formally SI). Two notable differences: (1) MySQL's REPEATABLE READ uses gap locks on locking reads to additionally prevent phantom reads, whereas Postgres's SI prevents phantoms structurally via snapshots alone; (2) MySQL's READ COMMITTED creates a fresh snapshot per SELECT, whereas Postgres's READ COMMITTED has similar per-query semantics but with different snapshot-acquisition mechanics. Canonical wiki canonicalisation of the engine-divergence at the REPEATABLE READ name: same isolation-level label, differently-implemented phantom-read protection.
sources/2025-05-03-aws-postgresql-transaction-visibility-read-replicas — AWS's framing of the Postgres Long Fork behavior as a deviation from formal Snapshot Isolation affecting all isolation levels; time-based MVCC in Aurora DSQL / Aurora Limitless as the side-step; CSN as the proposed upstream fix.
— Ben Dicken's first-principles PlanetScale primer explicitly contrasts how Postgres and MySQL achieve SI-style consistent reads: Postgres via multi-version row storage with xmin/xmax tuple metadata (old versions coexist on-heap, reclaimed later by VACUUM FULL), MySQL via an undo log that reconstructs old versions on-the-fly because writes overwrite in place. Same isolation-level name, inverted implementations. The article further contrasts the two engines' SERIALIZABLE write-conflict resolution: MySQL's lock-based pessimism + deadlock detection vs. Postgres's predicate-lock-based optimism with commit-time SSI abort.