CONCEPT Cited by 1 source
GetSnapshotData O(connections)
Definition
GetSnapshotData is the Postgres internal function that builds the MVCC snapshot a transaction uses to determine which tuples are visible to it. In Postgres versions before 14, GetSnapshotData has O(N) complexity in the number of open connections: every call scans the PGPROC array to collect the transaction IDs still in progress, which define the snapshot's visibility cutoff.
The implication: every open connection, even an idle one, makes every new transaction in the cluster slightly slower. Connection count becomes a first-class cost dimension, not just a memory / file-descriptor dimension.
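The scan described above can be sketched as a toy model. This is not Postgres source code: `ProcSlot`, `Snapshot`, `get_snapshot` and `is_visible` are hypothetical names, commit status is not modeled, and `xmax` is simplified to the next unassigned transaction ID. The point is only that idle slots still have to be visited.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProcSlot:
    """Hypothetical stand-in for one backend's entry in the process array."""
    xid: Optional[int] = None  # None: idle, no transaction ID assigned


@dataclass
class Snapshot:
    xmin: int           # every xid below this had finished at snapshot time
    xmax: int           # every xid at or above this is treated as in progress
    xip: List[int]      # xids in progress when the snapshot was taken
    slots_scanned: int  # bookkeeping for this sketch only


def get_snapshot(slots: List[ProcSlot], next_xid: int) -> Snapshot:
    """Build a snapshot by visiting every slot, idle or not."""
    xip: List[int] = []
    xmin = next_xid
    scanned = 0
    for slot in slots:
        scanned += 1            # an idle slot still costs one visit
        if slot.xid is None:
            continue
        xip.append(slot.xid)
        xmin = min(xmin, slot.xid)
    return Snapshot(xmin=xmin, xmax=next_xid, xip=sorted(xip), slots_scanned=scanned)


def is_visible(xid: int, snap: Snapshot) -> bool:
    """Whether the effects of a (committed) writer xid are visible to snap."""
    if xid >= snap.xmax:
        return False            # started after the snapshot was taken
    if xid < snap.xmin:
        return True             # finished before the snapshot was taken
    return xid not in snap.xip  # in between: visible unless still running


busy = [ProcSlot(xid=100), ProcSlot(xid=103)]
idle = [ProcSlot() for _ in range(998)]

small = get_snapshot(busy, next_xid=105)
large = get_snapshot(busy + idle, next_xid=105)

print(small.slots_scanned, large.slots_scanned)  # 2 1000
print(small.xip == large.xip)                    # True
```

Both snapshots carry the same visibility information; the second one simply took 500× as many slot visits to build.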
Why this is a scaling cliff
- 1000 idle connections are each cheap on their own, but together they give every GetSnapshotData call roughly 1000× as many entries to scan as a single connection would.
- Short transactions are dominated by snapshot setup, so the relative impact on a busy workload is large.
- The effect shows up as unexpectedly poor p99 latency at high connection counts: throughput holds but the tail degrades.
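To see why short transactions suffer most, a minimal linear cost model helps. It assumes snapshot cost grows proportionally with the connection count. The function name `snapshot_share` and all the numbers below are hypothetical, unit-less values chosen for illustration, not measurements.

```python
def snapshot_share(connections: int, per_slot_cost: float, work_cost: float) -> float:
    """Fraction of a transaction's time spent building its snapshot.

    Assumes a linear model: snapshot cost = per_slot_cost * connections.
    """
    snapshot_cost = per_slot_cost * connections
    return snapshot_cost / (snapshot_cost + work_cost)


# Hypothetical costs in arbitrary units: a short and a long transaction.
SHORT_WORK = 100.0
LONG_WORK = 10_000.0

print(round(snapshot_share(1, 1.0, SHORT_WORK), 2))     # 0.01
print(round(snapshot_share(1000, 1.0, SHORT_WORK), 2))  # 0.91
print(round(snapshot_share(1000, 1.0, LONG_WORK), 2))   # 0.09
```

Under these assumed costs, the same 1000 connections dominate the short transaction while remaining a minor share of the long one.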
This is one of the strongest architectural arguments for a
connection pooler (PgBouncer, Pgpool-II, etc.) sitting in
front of Postgres: the pooler holds thousands of client
connections while only a small, tuned number of connections
reach Postgres, keeping GetSnapshotData cheap.
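The pooling argument can be illustrated with a small simulation. This is a sketch of the general idea only, not how PgBouncer or Pgpool-II are implemented. `FixedPool` and `simulate` are hypothetical names, and the "backends" are plain integers rather than real database connections.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


class FixedPool:
    """Toy pooler: many clients share a fixed set of backend connections."""

    def __init__(self, size: int) -> None:
        self._free: "queue.Queue[int]" = queue.Queue()
        for backend_id in range(size):
            self._free.put(backend_id)
        self._lock = threading.Lock()
        self.backends_used = set()

    def run(self, work):
        backend_id = self._free.get()   # blocks while every backend is busy
        try:
            with self._lock:
                self.backends_used.add(backend_id)
            return work(backend_id)
        finally:
            self._free.put(backend_id)  # hand the backend to the next client


def simulate(clients: int, pool_size: int) -> int:
    """Run one unit of work per client; return how many backends were touched."""
    pool = FixedPool(pool_size)
    with ThreadPoolExecutor(max_workers=clients) as executor:
        results = list(executor.map(lambda i: pool.run(lambda _b: i), range(clients)))
    assert results == list(range(clients))  # every client got its answer
    return len(pool.backends_used)


used = simulate(clients=200, pool_size=5)
print(used <= 5)  # True: the database side never sees more than 5 connections
```

However many clients connect, the number of slots a snapshot scan would have to visit is bounded by the pool size.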
Upstream history
- Postgres 14 (2021) shipped Andres Freund's patches that made GetSnapshotData effectively O(1) in the common case: a backend can reuse its cached snapshot when no transaction has completed since it was built, and the per-backend transaction state was reorganized into dense arrays so that a full rebuild touches far less memory. The commit messages and associated mailing-list discussion framed this as one of the larger connection-scalability wins in recent Postgres history.
- Older Postgres versions (13 and earlier) still carry the O(N) cost. Not every workload hits it at the same connection count, but beyond a few hundred active backends the curve bends.
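The cached-snapshot idea can be sketched as follows. It assumes a counter that advances whenever a transaction completes, so an unchanged counter means the previous snapshot is still valid. This is a toy model, not the Postgres implementation: `SnapshotSource` and its fields are hypothetical, a snapshot is reduced to an `(xmax, in-progress xids)` pair, and a single shared cache stands in for what would be per-backend state.

```python
from typing import Dict, Optional, Tuple

Snap = Tuple[int, Tuple[int, ...]]  # (xmax, xids in progress below xmax)


class SnapshotSource:
    def __init__(self) -> None:
        self.slots: Dict[int, int] = {}  # backend id -> running xid
        self.latest_completed = 0        # highest xid that has finished
        self.completion_count = 0        # bumped on every completion
        self.full_scans = 0              # bookkeeping for this sketch only
        self._cached: Optional[Tuple[int, Snap]] = None

    def begin(self, backend: int, xid: int) -> None:
        self.slots[backend] = xid

    def finish(self, backend: int) -> None:
        xid = self.slots.pop(backend)
        self.latest_completed = max(self.latest_completed, xid)
        self.completion_count += 1

    def get_snapshot(self) -> Snap:
        # Fast path: nothing has completed since the cached snapshot was built.
        if self._cached is not None and self._cached[0] == self.completion_count:
            return self._cached[1]
        # Slow path: scan every running backend, as the pre-14 code always did.
        self.full_scans += 1
        xmax = self.latest_completed + 1
        xip = tuple(sorted(x for x in self.slots.values() if x < xmax))
        snap = (xmax, xip)
        self._cached = (self.completion_count, snap)
        return snap


src = SnapshotSource()
src.begin(1, 10)
src.begin(2, 11)
src.finish(1)

first = src.get_snapshot()   # full scan
src.begin(3, 12)             # a new xid is >= xmax, so the cache stays valid
second = src.get_snapshot()  # reused
print(first == second, src.full_scans)  # True 1

src.finish(3)                # a completion invalidates the cache
print(src.get_snapshot(), src.full_scans)  # (13, (11,)) 2
```

Reuse is safe in this model because a newly started transaction always gets an ID at or above the cached `xmax`, which the snapshot already treats as in progress.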
Seen in
- sources/2020-06-23-zalando-pgbouncer-on-kubernetes-minimal-latency: Zalando's Postgres Operator team cites GetSnapshotData's complexity as one of the two canonical reasons for needing a connection pooler, alongside the process-per-connection memory / context-switch overhead. Kukushkin links directly to Andres Freund's 2020 thread noting "GetSnapshotData has O(connections) complexity", the thread that led to the Postgres 14 fix.
Related
- systems/postgresql · systems/pgbouncer: the pooler that sidesteps the cost.
- concepts/process-per-connection-postgres: the sister connection-cost dimension.
- concepts/snapshot-isolation: the isolation model GetSnapshotData implements.