CONCEPT
Process-per-connection Postgres¶
Definition¶
Postgres uses a "process per user" client/server model: each client connection is served by a dedicated OS process (a postmaster-forked backend), not by a thread, a goroutine, or an async state machine. The postmaster listens for new TCP connections, fork()s a backend for each one, and that backend handles the session until the client disconnects.
This model is simple, robust, and decades-old — crashes in one backend can't corrupt another's memory — but it comes with a set of scalability costs that make connection pooling essential at scale.
Costs¶
- Memory per backend — each Postgres backend allocates session-local state, prepared-statement plan caches, and several buffers. In production Postgres, 5-10 MB per connection is a common figure; 1000 connections = several GB of resident memory.
- Context-switch overhead — hundreds or thousands of processes competing for CPU time trigger frequent context switches. Each switch invalidates TLB entries, evicts cache lines, and saves/restores register state.
- CPU migration — the kernel load-balancer may move backends between cores, breaking cache locality.
- fork() cost — connection setup time is non-trivial for short-lived connections (a common ORM / serverless pattern).
- GetSnapshotData O(N) — see concepts/getsnapshotdata-o-n. The transaction visibility function has historically scanned the entire process table, making the per-transaction cost scale with connection count.
Why it matters¶
The combination of these costs is the single biggest reason connection poolers exist in the Postgres ecosystem. A typical deployment:
- Application tier: 10,000+ client connections (per-pod pools across a Kubernetes fleet).
- PgBouncer tier: accepts all 10,000, multiplexes over…
- Postgres tier: a tuned ceiling of 100-400 real backend processes.
The pooler effectively converts a process-per-connection model into an illusion of process-per-client with a much smaller real footprint — at the cost of some session-state restrictions (no long-running transactions across pool borrows, limitations on prepared statements depending on pooling mode).
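A minimal pgbouncer.ini sketch of the tiered deployment above (the `appdb` database name, host address, and file paths are illustrative; the option names are real PgBouncer settings):

```ini
[databases]
; clients connect to "appdb" on PgBouncer; it proxies to the real server
appdb = host=10.0.0.5 port=5432 dbname=appdb

[pgbouncer]
listen_addr       = 0.0.0.0
listen_port       = 6432
auth_type         = md5
auth_file         = /etc/pgbouncer/userlist.txt
pool_mode         = transaction   ; multiplex at transaction boundaries
max_client_conn   = 10000         ; application-tier connections accepted
default_pool_size = 100           ; real Postgres backends per db/user pair
```

`pool_mode = transaction` is what buys the 10,000-to-100 multiplexing, and it is also the source of the session-state restrictions mentioned above, since a client may get a different backend for each transaction.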
Contrast with thread-per-connection databases¶
MySQL and MongoDB use thread-per-connection models (or thread pools). These have lower per-connection memory and startup cost (threads are cheaper than processes) but share the same working-set and context-switch problems past thousands of connections — they too benefit from poolers, just further up the curve.
Seen in¶
- sources/2020-06-23-zalando-pgbouncer-on-kubernetes-minimal-latency — Zalando frames the process model as one of the two foundational motivations for connection pooling (the other being concepts/getsnapshotdata-o-n). "PostgreSQL uses a 'process per user' client/server model, in which too many connections mean too many processes fighting for resources and drowning in context switches and CPU migrations."
Related¶
- systems/postgresql · systems/pgbouncer
- concepts/process-os · concepts/context-switch
- concepts/getsnapshotdata-o-n — the companion per-connection cost.
- patterns/process-per-connection-database — the cross-database pattern.