
Process-per-connection Postgres

Definition

Postgres uses a "process per user" client/server model: each client connection is served by a dedicated OS process (a backend forked by the postmaster), not by a thread, a goroutine, or an async state machine. The postmaster listens for incoming connections (TCP or Unix-domain sockets), fork()s a backend for each one, and that backend handles the session until the client disconnects.
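The accept-then-fork loop can be sketched in a few lines. This is a toy illustration in Python, not Postgres source (the real postmaster is C, and `serve_once` is an invented name); it shows the key move: after fork(), the child owns the connection for the life of the session while the parent goes back to listening.

```python
import os
import socket

def serve_once(listener: socket.socket) -> int:
    """Accept one connection and fork a dedicated child, postmaster-style.
    Returns the child's pid to the parent."""
    conn, _addr = listener.accept()
    pid = os.fork()
    if pid == 0:                  # child: the "backend" process
        listener.close()          # the backend never touches the listen socket
        data = conn.recv(1024)    # serve the session (here: a single echo)
        conn.sendall(b"echo: " + data)
        conn.close()
        os._exit(0)               # backend exits when the session ends
    conn.close()                  # parent keeps only the listen socket
    return pid

if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))        # ephemeral port for the demo
    listener.listen(8)
    client = socket.create_connection(listener.getsockname())
    child = serve_once(listener)
    client.sendall(b"SELECT 1")
    print(client.recv(1024).decode())      # the child's reply
    os.waitpid(child, 0)                   # reap the exited "backend"
```

Every session costs a whole process here, which is exactly the trade the rest of this page is about.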

This model is simple, robust, and decades-old — crashes in one backend can't corrupt another's memory — but it comes with a set of scalability costs that make connection pooling essential at scale.

Costs

  1. Memory per backend — each Postgres backend allocates session-local state, prepared-statement plan caches, and several buffers. In production Postgres, 5-10 MB per connection is a common figure; at 1,000 connections that is several GB of resident memory.
  2. Context-switch overhead — hundreds or thousands of processes competing for CPU time trigger frequent context switches. Each switch saves and restores register state and pollutes TLB entries and cache lines.
  3. CPU migration — the kernel load-balancer may move backends between cores, breaking cache locality.
  4. fork() cost — each new connection pays for process creation plus backend startup (authentication, catalog-cache warm-up), which is non-trivial for short-lived connections (a common ORM / serverless pattern).
  5. GetSnapshotData O(N) — see concepts/getsnapshotdata-o-n. The transaction visibility function has historically scanned the entire process table, making the per-transaction cost scale with connection count.
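The last cost is worth making concrete. The toy model below (not Postgres's actual C implementation; `ProcEntry` and `get_snapshot` are invented names) captures the shape of the problem: building a snapshot walks one slot per connection, so every transaction pays for every connection, including idle ones.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProcEntry:
    pid: int
    xid: Optional[int]  # in-progress transaction id, or None if idle

def get_snapshot(proc_array):
    """Toy GetSnapshotData: scan the whole proc array to collect
    in-progress xids. Cost is O(N) in connection count, active or not."""
    xmin = None
    in_progress = []
    for entry in proc_array:            # one slot per connection
        if entry.xid is not None:
            in_progress.append(entry.xid)
            if xmin is None or entry.xid < xmin:
                xmin = entry.xid
    return xmin, sorted(in_progress)

# 1,000 connections but only 3 running transactions:
# the scan still touches every slot.
procs = [ProcEntry(pid=i, xid=None) for i in range(1000)]
procs[10].xid, procs[500].xid, procs[999].xid = 101, 103, 102
print(get_snapshot(procs))  # (101, [101, 102, 103])
```

The scan length is set by the connection ceiling, not by how busy the database is, which is why capping real backends (see below) helps even mostly-idle fleets.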

Why it matters

The combination of these costs is the single biggest reason connection poolers exist in the Postgres ecosystem. A typical deployment:

  • Application tier: 10,000+ client connections (per-pod pools across a Kubernetes fleet).
  • PgBouncer tier: accepts all 10,000 client connections and multiplexes them over a small shared pool of server connections.
  • Postgres tier: a tuned ceiling of 100-400 real backend processes.

The pooler effectively converts a process-per-connection model into an illusion of process-per-client with a much smaller real footprint — at the cost of some session-state restrictions (no long-running transactions across pool borrows, limitations on prepared statements depending on pooling mode).
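A pgbouncer.ini fragment shows where the ceiling is set (option names are real PgBouncer settings; the host, database name, and numbers are illustrative, not a recommendation):

```ini
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction   ; release the server conn at transaction end
max_client_conn = 10000   ; client-facing ceiling
default_pool_size = 100   ; real Postgres backends per database/user pair
```

`pool_mode = transaction` is what enables the high client-to-backend ratio, and is also the source of the session-state restrictions mentioned above.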

Contrast with thread-per-connection databases

MySQL and MongoDB use thread-per-connection models (or thread pools). These have lower per-connection memory and startup cost (threads are cheaper than processes) but share the same working-set and context-switch problems past thousands of connections — they too benefit from poolers, just further up the curve.
