CONCEPT Cited by 1 source
Commit Sequence Number (CSN)
Definition
A Commit Sequence Number (CSN) is a monotonically increasing integer stamped onto a transaction at commit time, used as the canonical position of that commit in the database's global commit history. MVCC visibility is then re-expressed as "a snapshot with watermark S sees every transaction with CSN ≤ S and none with CSN > S." This replaces the ProcArray-scan approach ("which xids are currently in-flight, exclude them from this snapshot") with a strictly ordered, monotonic one.
CSN is the proposed upstream fix for Postgres's Long Fork anomaly. By ordering commits by a sequence number that is assigned atomically with the WAL commit record, visibility order is made equal to commit order — eliminating the visibility-vs-commit-order divergence that is the root cause of the anomaly.
Per AWS's 2025-05-03 response to Jepsen's RDS Multi-AZ analysis (Source: sources/2025-05-03-aws-postgresql-transaction-visibility-read-replicas):
"One such solution makes the visibility order match the commit order using Commit Sequence Numbers (CSNs). The proposed fix is rather involved and spans multiple patches."
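As a rough illustration of the two visibility rules, the Python sketch below models a snapshot both ways. It is a toy model, not PostgreSQL code: `InFlightSnapshot`, `CsnSnapshot`, the xids 101 and 102, and the scenario itself are hypothetical names and values chosen for the example. The assumed scenario is that T1 (xid 101) writes its WAL commit record before T2 (xid 102), but T2 is marked finished on the primary first, while a replica replays the WAL in order.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Set


@dataclass(frozen=True)
class InFlightSnapshot:
    """Toy ProcArray-style snapshot: remembers which xids were still in flight."""
    xmax: int                   # first xid not yet assigned at snapshot time
    in_flight: FrozenSet[int]   # xids assigned but not yet marked finished

    def sees(self, xid: int) -> bool:
        return xid < self.xmax and xid not in self.in_flight


@dataclass(frozen=True)
class CsnSnapshot:
    """Toy CSN snapshot: a single watermark into the commit history."""
    watermark: int

    def sees(self, csn: Optional[int]) -> bool:
        # csn is None for a transaction that has not committed.
        return csn is not None and csn <= self.watermark


def visible(snapshot: InFlightSnapshot, xids: Iterable[int]) -> Set[int]:
    return {x for x in xids if snapshot.sees(x)}


XIDS = (101, 102)

# Primary: T2 already left the in-flight list, T1 has not (assumed interleaving).
primary_view = visible(InFlightSnapshot(103, frozenset({101})), XIDS)
# Replica: has replayed T1's commit record but not yet T2's.
replica_view = visible(InFlightSnapshot(103, frozenset({102})), XIDS)
# Neither view contains the other, so no single commit order explains both.
long_fork = not (primary_view <= replica_view or replica_view <= primary_view)

# CSN rule: T1 gets CSN 1 and T2 gets CSN 2, matching WAL order.
csn_of = {101: 1, 102: 2}
csn_views = []
for s in range(3):
    snap = CsnSnapshot(s)
    csn_views.append({x for x in XIDS if snap.sees(csn_of[x])})
```

In this model every CSN snapshot is a prefix of the commit history, so any two views are ordered by inclusion, whereas the two in-flight-set snapshots disagree about which of T1 and T2 happened first.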
Why it is "rather involved"
CSN is superficially a small change (add a counter, replace the ProcArray scan with a watermark comparison), but it touches many load-bearing Postgres internals:
- Commit path — CSN assignment must be atomic with WAL commit-record fsync so commit order = CSN order without re-introducing skew.
- Snapshot acquisition — shifts from "scan an in-memory list of active xids" to "read the current CSN watermark"; the cost model changes (usually downward — scanning `ProcArray` at thousands of connections is "a measurable fraction of CPU" on large Postgres servers).
- Visibility checks on tuples — every tuple needs to map back to a CSN (directly, or via a mapping layer); existing code paths are keyed on xid.
- Old-snapshot handling — CSN watermarks have to interact correctly with VACUUM, long-running transactions, subtransactions, and prepared transactions.
- Replication — replicas need to apply WAL and update their CSN watermark in lockstep; the protocol has to carry CSN information.
- Backward compatibility — existing extensions, monitoring tools, and user-facing xid-exposing functions can't silently break.
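The commit-path, snapshot-acquisition, and tuple-visibility items above can be sketched together in a few lines of Python. This is a minimal model under stated assumptions, not the proposed patch series: `CsnLog`, `csn_by_xid`, and `tuple_visible` are hypothetical names, the WAL is a plain list, and a single lock stands in for whatever mechanism makes CSN assignment atomic with the commit record. A real implementation would not hold a global lock across an fsync.

```python
import threading


class CsnLog:
    """Toy commit path: WAL append and CSN assignment happen under one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_csn = 1
        self.wal = []          # commit records in WAL order: (csn, xid)
        self.csn_by_xid = {}   # hypothetical xid -> CSN mapping layer

    def commit(self, xid):
        # Assigning the CSN and appending the record in one critical section
        # keeps CSN order identical to WAL order.
        with self._lock:
            csn = self._next_csn
            self._next_csn += 1
            self.wal.append((csn, xid))
            self.csn_by_xid[xid] = csn
            return csn

    def snapshot(self):
        # Snapshot acquisition is a single read of the latest assigned CSN.
        with self._lock:
            return self._next_csn - 1

    def tuple_visible(self, xmin, watermark):
        # Tuples are still keyed on xid, so map the inserting xid to its CSN.
        # Simplification: ignores the deleting transaction (xmax).
        csn = self.csn_by_xid.get(xmin)
        return csn is not None and csn <= watermark


def run_concurrent_commits(log, workers=8, per_worker=50):
    """Commit from several threads, each using its own range of xids."""
    def work(base):
        for i in range(per_worker):
            log.commit(base + i)

    threads = [threading.Thread(target=work, args=(1000 * (w + 1),))
               for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

A short usage example: commit xid 500, take a snapshot, then commit xid 501. The snapshot's watermark is 1, so `tuple_visible(500, 1)` holds and `tuple_visible(501, 1)` does not, regardless of how many connections exist.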
The multi-patch series was discussed on pgsql-hackers and presented at PGConf.EU 2024 ("High-concurrency distributed snapshots," Ants Aasma). No upstream landing date has been disclosed. AWS's PostgreSQL Contributors Team (formed 2022) is participating.
Why fixing this is worth the cost
The Long Fork anomaly rarely impacts end-user application correctness (most apps serialize via explicit row conflicts or app-level ordering). But it blocks a family of enterprise-grade distributed-Postgres capabilities (Source: sources/2025-05-03-aws-postgresql-transaction-visibility-read-replicas):
- Distributed-SQL consistent snapshots — requires a globally-consistent view of pending transactions. With `ProcArray`, this is "practically infeasible" across nodes; with CSN, you're comparing watermarks.
- Query routing to synchronously-caught-up replicas without non-repeatable-read surprises.
- Snapshot-on-primary + WAL-replay sync pipelines that don't land in never-observable states.
- Point-in-time restore to an LSN that produces an observable-on-primary state.
- Storage-layout optimization that replaces xid with logical/clock-based commit time in tuples without breaking query repeatability.
- CPU reclamation — scanning `ProcArray` becomes unnecessary at snapshot time.
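To make the "comparing watermarks" point concrete, here is a small Python sketch of routing a read to a replica that has replayed far enough. Everything in it is an assumption made for illustration: `Node`, `replay`, `read_at`, and `eligible` are hypothetical names, and the commit log is modeled as a list of `(csn, xid)` pairs rather than real WAL.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Node:
    """Toy primary or replica holding commit records in CSN order."""
    name: str
    log: List[Tuple[int, int]] = field(default_factory=list)  # (csn, xid)

    @property
    def applied_csn(self) -> int:
        return self.log[-1][0] if self.log else 0

    def replay(self, record: Tuple[int, int]) -> None:
        csn, _xid = record
        # The watermark advances in lockstep with replay, one record at a time.
        if csn != self.applied_csn + 1:
            raise ValueError("commit records must be replayed in CSN order")
        self.log.append(record)

    def read_at(self, watermark: int) -> Set[int]:
        if watermark > self.applied_csn:
            raise ValueError("node has not replayed up to this watermark")
        return {xid for csn, xid in self.log if csn <= watermark}


def eligible(nodes: List[Node], required_csn: int) -> List[str]:
    """Names of nodes that can serve a read pinned at required_csn."""
    return [n.name for n in nodes if n.applied_csn >= required_csn]


records = [(1, 101), (2, 102), (3, 103)]
primary, fast, slow = Node("primary"), Node("fast"), Node("slow")
for r in records:
    primary.replay(r)
    fast.replay(r)
slow.replay(records[0])   # the slow replica has only applied CSN 1

candidates = eligible([fast, slow], required_csn=2)
```

Because a snapshot in this model is just a prefix of the commit history, a read pinned at watermark 2 returns the same set of transactions on the primary and on any replica whose applied CSN is at least 2; the router only has to compare two integers.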
In short: ProcArray as the visibility substrate is both wrong (under the formal definition of snapshot isolation) and expensive (under production workloads with many connections). CSN addresses both.
Alternatives and adjacent shapes
- Time-based MVCC — instead of a dense monotonic integer, use a (logical or physical) clock as the visibility key. Conceptually similar to CSN but with different distributed-clock trade-offs. This is the shape systems/aurora-dsql and systems/aurora-limitless use, replacing Postgres's `ProcArray`-based visibility wholesale via the public extension API (see patterns/postgres-extension-over-fork). Obtaining a consistent snapshot is a clock-read, not a fanned-out consensus over every node's pending-xid list. (Source: sources/2025-05-03-aws-postgresql-transaction-visibility-read-replicas.)
- Global-sequencer services (Spanner / TrueTime) — commercial distributed SQL with time-backed snapshots using specialized hardware clocks; fuller-replacement shape, not an incremental fix to an existing OSS DB.
Relation to other wiki concepts
- concepts/long-fork-anomaly — the specific violation CSN fixes.
- concepts/snapshot-isolation — the isolation model formally requiring what CSN provides.
- concepts/visibility-order-vs-commit-order — the root-cause framing CSN closes.
- concepts/postgres-mvcc-hot-updates — the tuple-versioning layer CSN-based visibility checks would run against; unchanged in principle, but the visibility test becomes a CSN compare instead of a `ProcArray` scan.
- concepts/wal-write-ahead-logging — CSN must be stamped atomically with the WAL commit record to preserve commit = visibility ordering.
Seen in
- sources/2025-05-03-aws-postgresql-transaction-visibility-read-replicas — AWS's 2025-05-03 response to Jepsen names CSN as the proposed upstream fix for the Long Fork anomaly; multi-patch effort; PGConf.EU 2024 talk; AWS contributing.