
Process-per-connection Postgres

Definition

Postgres uses a "process per user" client/server model: each client connection is served by a dedicated OS process (a backend forked by the postmaster), not by a thread, a goroutine, or an async state machine. The postmaster listens for incoming connections (TCP or Unix-domain sockets), fork()s a backend for each one, and that backend handles the session until the client disconnects.
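The accept-then-fork loop can be sketched in a few lines. This is a toy illustration in Python, not Postgres source (the real postmaster is C, and `serve_once` is an invented name); it shows the key move: after fork(), the child owns the connection for the life of the session while the parent goes back to listening.

```python
import os
import socket

def serve_once(listener: socket.socket) -> int:
    """Accept one connection and fork a dedicated child, postmaster-style.
    Returns the child's pid to the parent."""
    conn, _addr = listener.accept()
    pid = os.fork()
    if pid == 0:                  # child: the "backend" process
        listener.close()          # the backend never touches the listen socket
        data = conn.recv(1024)    # serve the session (here: a single echo)
        conn.sendall(b"echo: " + data)
        conn.close()
        os._exit(0)               # backend exits when the session ends
    conn.close()                  # parent keeps only the listen socket
    return pid

if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))        # ephemeral port for the demo
    listener.listen(8)
    client = socket.create_connection(listener.getsockname())
    child = serve_once(listener)
    client.sendall(b"SELECT 1")
    print(client.recv(1024).decode())      # the child's reply
    os.waitpid(child, 0)                   # reap the exited "backend"
```

Every session costs a whole process here, which is exactly the trade the rest of this page is about.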

This model is simple, robust, and decades-old — crashes in one backend can't corrupt another's memory — but it comes with a set of scalability costs that make connection pooling essential at scale.

Costs

  1. Memory per backend — each Postgres backend allocates session-local state, prepared-statement plan caches, and several buffers. In production Postgres, 5-10 MB per connection is a common figure; at 1,000 connections that is several GB of resident memory.
  2. Context-switch overhead — hundreds or thousands of processes competing for CPU time trigger frequent context switches. Each switch saves and restores register state and pollutes TLB entries and cache lines.
  3. CPU migration — the kernel load-balancer may move backends between cores, breaking cache locality.
  4. fork() cost — each new connection pays for process creation plus backend startup (authentication, catalog-cache warm-up), which is non-trivial for short-lived connections (a common ORM / serverless pattern).
  5. GetSnapshotData O(N) — see concepts/getsnapshotdata-o-n. The transaction visibility function has historically scanned the entire process table, making the per-transaction cost scale with connection count.
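The last cost is worth making concrete. The toy model below (not Postgres's actual C implementation; `ProcEntry` and `get_snapshot` are invented names) captures the shape of the problem: building a snapshot walks one slot per connection, so every transaction pays for every connection, including idle ones.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProcEntry:
    pid: int
    xid: Optional[int]  # in-progress transaction id, or None if idle

def get_snapshot(proc_array):
    """Toy GetSnapshotData: scan the whole proc array to collect
    in-progress xids. Cost is O(N) in connection count, active or not."""
    xmin = None
    in_progress = []
    for entry in proc_array:            # one slot per connection
        if entry.xid is not None:
            in_progress.append(entry.xid)
            if xmin is None or entry.xid < xmin:
                xmin = entry.xid
    return xmin, sorted(in_progress)

# 1,000 connections but only 3 running transactions:
# the scan still touches every slot.
procs = [ProcEntry(pid=i, xid=None) for i in range(1000)]
procs[10].xid, procs[500].xid, procs[999].xid = 101, 103, 102
print(get_snapshot(procs))  # (101, [101, 102, 103])
```

The scan length is set by the connection ceiling, not by how busy the database is, which is why capping real backends (see below) helps even mostly-idle fleets.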

Why it matters

The combination of these costs is the single biggest reason connection poolers exist in the Postgres ecosystem. A typical deployment:

  • Application tier: 10,000+ client connections (per-pod pools across a Kubernetes fleet).
  • PgBouncer tier: accepts all 10,000 client connections and multiplexes them over a small shared pool of server connections.
  • Postgres tier: a tuned ceiling of 100-400 real backend processes.

The pooler effectively converts a process-per-connection model into an illusion of process-per-client with a much smaller real footprint — at the cost of some session-state restrictions (no long-running transactions across pool borrows, limitations on prepared statements depending on pooling mode).
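A pgbouncer.ini fragment shows where the ceiling is set (option names are real PgBouncer settings; the host, database name, and numbers are illustrative, not a recommendation):

```ini
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction   ; release the server conn at transaction end
max_client_conn = 10000   ; client-facing ceiling
default_pool_size = 100   ; real Postgres backends per database/user pair
```

`pool_mode = transaction` is what enables the high client-to-backend ratio, and is also the source of the session-state restrictions mentioned above.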

Contrast with thread-per-connection databases

MySQL and MongoDB use thread-per-connection models (or thread pools). These have lower per-connection memory and startup cost (threads are cheaper than processes) but share the same working-set and context-switch problems past thousands of connections — they too benefit from poolers, just further up the curve.
