
PATTERN

Process-per-connection database

What it is

Process-per-connection is a database server architecture where each client connection is served by a dedicated OS [[concepts/process-os|process]] spawned via fork() at connection time, not a thread inside a shared process. The canonical production example is Postgres: a single supervisor process (PostMaster) accepts new TCP connections and forks a backend process per client; the backend handles all queries for the life of that connection.
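
This accept-and-fork loop can be sketched in a few lines. This is an illustrative minimal sketch, not Postgres's actual code; `handle_connection` and the port number are hypothetical stand-ins for a backend's query loop.

```python
import os
import socket

def handle_connection(conn: socket.socket) -> None:
    # Placeholder for the backend's query loop: one process, one client,
    # for the life of the connection. Here we just echo bytes back.
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)

def serve(host: str = "127.0.0.1", port: int = 5433) -> None:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen()
        while True:
            conn, _addr = srv.accept()
            pid = os.fork()          # dedicated OS process per connection
            if pid == 0:             # child ("backend"): serve this client
                srv.close()
                handle_connection(conn)
                os._exit(0)
            conn.close()             # parent ("supervisor") keeps accepting
```

The supervisor holds only the listening socket; each forked child owns exactly one client socket, which is what gives every connection its own address space.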

Canonical framing

"Postgres is implemented with a process-per-connection architecture. Each time a client makes a connection, a new Postgres process is created on the server's operating system. There is a single 'main' process (PostMaster) that manages Postgres operations, and all new connections create a new Process that coordinates with PostMaster." (Source: [[sources/2026-04-21-planetscale-processes-and-threads]])

Why this architecture

The choice goes back to Postgres's 1980s design era, when:

  • Isolation was the priority — a bug in one backend couldn't corrupt another's memory, because they're in separate address spaces.
  • Shared-memory IPC was the fast path — Postgres uses a shared memory region (buffer pool, lock tables) that backends mmap into; writes are coordinated via explicit latches and locks.
  • Thread libraries were immature + unportable — POSIX threads (pthread_create) weren't standardised until the late 1990s; processes were the portable concurrency primitive across Unix variants.
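
The isolation point can be observed directly: after fork(), parent and child have separate (copy-on-write) address spaces, so a stray write in one process never reaches the other. A minimal sketch:

```python
import os

def demo_isolation() -> list:
    state = [0]                  # lives in the parent's address space
    pid = os.fork()
    if pid == 0:                 # child: "corrupt" its own private copy
        state[0] = 0xDEADBEEF
        os._exit(0)
    os.waitpid(pid, 0)           # wait for the child to finish
    return state                 # parent's copy is untouched: [0]
```

A thread performing the same write would have clobbered shared memory; the fork boundary is what contains the damage.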

The inherited trade-off has persisted: later-generation engines (MySQL, SQL Server, Oracle) adopted thread-per-connection and pay less per-connection memory, but Postgres's design is structurally wedded to processes.

Costs

Dicken names the structural criticism verbatim: "Processes are heavy: there is memory overhead and a time overhead for managing them." The concrete costs:

  • Memory per connection — a Postgres backend carries its own work_mem allocation + query-plan cache + catalog cache + temporary buffers, typically ~5–10 MB steady-state plus whatever the active query needs. A server with 10,000 open connections needs 50–100 GB just for connection overhead.
  • Fork cost — fork() itself is relatively cheap on modern kernels (copy-on-write means data pages aren't duplicated up front; only page tables are copied), but the cost compounds when connection churn is high.
  • Context-switch tax — process switching costs ~5 μs (vs ~1 μs for thread switching). At thousands of active backends, the scheduler spends measurable CPU just swapping between them.
  • max_connections ceiling — hard configurable limit on concurrent backends; raising it allocates proportionally more shared memory and grows the risk of OS-level [[concepts/memory-overcommit-risk|memory overcommit]].
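
The memory figure above is simple arithmetic, sketched here assuming the ~5–10 MB steady-state per backend the text cites (the note's 50–100 GB rounds to decimal gigabytes):

```python
def connection_memory_gb(connections: int, per_backend_mb: float) -> float:
    """Steady-state memory attributable to backend processes alone (GiB)."""
    return connections * per_backend_mb / 1024

# 10,000 open connections at 5-10 MB per backend:
low = connection_memory_gb(10_000, 5)    # ~49 GiB
high = connection_memory_gb(10_000, 10)  # ~98 GiB
```

That is pure connection overhead, before any query's work_mem is counted.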

Mitigations

The universal mitigation is external connection pooling ([[concepts/connection-pool-exhaustion]]): interpose a pooler (PgBouncer is the canonical choice for Postgres) between applications and the database. The pooler accepts thousands of client connections and multiplexes them over a small pool (typically 5–50) of real Postgres backends.

"Connection poolers sit between clients and the database. All connections from the client are made to the pooler, which is designed to be able to handle thousands at a time. It maintains its own pool of direct connections to the database, typically between 5 and 50." (Source: [[sources/2026-04-21-planetscale-processes-and-threads]])

PgBouncer's three pooling modes (session / transaction / statement) trade isolation for density: transaction pooling is the common production default, releasing the backend back to the pool at COMMIT/ROLLBACK so one backend can serve many transactions from different clients over its lifetime.
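
The transaction-pooling idea — check a backend out per transaction, return it at COMMIT/ROLLBACK — can be sketched as below. This is an illustrative model, not PgBouncer's implementation; `TransactionPool` and the string "backends" are hypothetical stand-ins for real server connections.

```python
import queue
from contextlib import contextmanager

class TransactionPool:
    """Many clients multiplexed over a small fixed set of backends."""

    def __init__(self, backends):
        self._idle = queue.Queue()
        for b in backends:            # e.g. 5-50 real server connections
            self._idle.put(b)

    @contextmanager
    def transaction(self):
        backend = self._idle.get()    # block until a backend is free
        try:
            yield backend             # client runs BEGIN ... COMMIT here
        finally:
            self._idle.put(backend)   # released at transaction boundary

# Usage: any number of clients share two backends, one transaction at a time.
pool = TransactionPool(["backend-1", "backend-2"])
with pool.transaction() as b:
    pass  # execute one transaction on b
```

Because the checkout lasts only one transaction rather than one session, a single backend serves many clients over its lifetime — the density gain that makes transaction pooling the common production default.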

When to choose

Use process-per-connection when:

  • Isolation is a hard requirement — e.g. workloads where a crash or memory corruption in one connection's backend must not affect others. Postgres's per-backend crash is typically recoverable without restarting the whole server.
  • The operational ecosystem expects it — Postgres's ecosystem (extensions, monitoring, connection-pooler lineup, replication tooling) is built around the process model; deviating adds friction.
  • Connection fan-in is bounded — if you can cap concurrent client connections (via pooler, admission control, or application-tier budgeting), the memory tax is manageable.
