
PATTERN

SO_REUSEPORT multi-process single-port

Statement

Scale a single-threaded event-loop service beyond one CPU core on the same host by running N independent processes and letting them share a listening port via the Linux SO_REUSEPORT socket option. The kernel distributes incoming connections across the processes via a hash of the connection 4-tuple (source address, source port, destination address, destination port) — no user-space load balancer, no multi-threading required.
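
The mechanics can be sketched in a few lines of Python (Linux ≥ 3.9 required; the port number and worker count are illustrative, not from the source):

```python
import os
import signal
import socket
import time

PORT = 15432  # illustrative port


def make_listener(port: int) -> socket.socket:
    """Create a listening socket that shares `port` with other processes."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEPORT must be set before bind(), in every process sharing the port.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.listen(128)
    return s


def worker(port: int) -> None:
    """Stand-in for a single-threaded event loop: accept and reply forever."""
    listener = make_listener(port)
    while True:
        conn, _ = listener.accept()
        conn.sendall(b"pid %d\n" % os.getpid())
        conn.close()


if __name__ == "__main__":
    # Launch two independent workers; the kernel load-balances between them.
    children = []
    for _ in range(2):
        pid = os.fork()
        if pid == 0:
            worker(PORT)  # never returns
        children.append(pid)

    time.sleep(0.3)  # crude wait for both workers to bind
    replies = []
    for _ in range(4):
        c = socket.create_connection(("127.0.0.1", PORT))
        replies.append(c.recv(64))
        c.close()

    for pid in children:
        os.kill(pid, signal.SIGTERM)
        os.waitpid(pid, 0)
    print(all(r.startswith(b"pid ") for r in replies))
```

Each process creates its own socket and calls bind()/listen() itself; there is no socket inheritance from a parent, which is what lets the processes be fully independent.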

When to use it

  • Single-threaded event-loop processes that saturate one CPU and can't be meaningfully refactored into threaded / async-multicore forms.
  • PgBouncer — the canonical example; its so_reuseport setting is the only way PgBouncer can use more than one core on a single host.
  • Older HAProxy / nginx deployments that predate multithreaded worker pools (or either server when run in per-process worker mode).
  • Custom Go/Rust/C services designed as single-process event loops.
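
For PgBouncer specifically, the opt-in is a single setting. A hedged pgbouncer.ini sketch (address, port, and the surrounding config are illustrative):

```ini
[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
; Each PgBouncer process sets SO_REUSEPORT on its own listening socket,
; so N identical processes can share listen_port (PgBouncer 1.12+).
so_reuseport = 1
```

Run N copies with this config (each process typically needs its own pidfile and unix-socket path) and the kernel spreads client connections across them.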

Why it works

  • Kernel-level load balancing is zero-configuration and near-zero-overhead — the routing decision happens in the kernel's socket lookup path, with no extra hop.
  • The 4-tuple hash is stable for the lifetime of a connection, so each TCP flow stays with the same worker (good for connection-state locality).
  • Horizontal scaling on a single host — no orchestration, no sidecar, no proxy. Launch N copies, configure SO_REUSEPORT, done.
  • Pairs with CPU pinning — pin each worker to a specific core (see patterns/fixed-cpu-pinning-for-latency-sensitive-pool) and you have a deterministic multi-core deployment without shared memory.
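
The pinning pairing can be sketched with Linux-only Python APIs (core selection is illustrative; the listener and event-loop body are elided):

```python
import os


def pin_to_core(core: int) -> None:
    # Restrict the calling process to a single CPU (os.sched_setaffinity is Linux-only).
    os.sched_setaffinity(0, {core})


if __name__ == "__main__":
    # One SO_REUSEPORT worker per core: fork, pin, then run the event loop.
    cores = sorted(os.sched_getaffinity(0))[:2]
    for core in cores:
        pid = os.fork()
        if pid == 0:
            pin_to_core(core)
            # ... create the SO_REUSEPORT listener and run the event loop here ...
            os._exit(0)
    for _ in cores:
        os.waitpid(-1, 0)
```

Pinning happens in the child after fork, so each worker's event loop (and its cache-warm state) stays on one core for its whole lifetime.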

Trade-offs

  • Connection-level hashing, not request-level — a heavy long-lived connection pins a worker. Uneven connection weight → uneven worker load.
  • No shared state between workers — caches, prepared statement caches, session state are per-worker. Applications that assume a single logical server behind the port must tolerate worker-local state.
  • Requires Linux kernel ≥ 3.9 and an explicit opt-in on every listening socket; the BSDs implement SO_REUSEPORT with subtly different semantics (FreeBSD's load-balancing equivalent is the separate SO_REUSEPORT_LB option).
  • Fate-sharing within a host — if the host dies all N workers die together; this pattern is orthogonal to multi-host replication for availability.
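
Why connection-level hashing still produces uneven load is easy to see in a toy simulation (everything here — worker count, the CRC32 stand-in for the kernel's flow hash, the heavy-tailed weights — is an assumption for illustration, not kernel code):

```python
import random
import zlib
from collections import Counter

random.seed(7)
WORKERS = 4
counts = Counter()  # connections per worker
load = Counter()    # total "work" per worker

for i in range(1000):
    # Each client connection gets a fresh ephemeral port, hence a new 4-tuple.
    four_tuple = ("10.0.0.1", 40000 + i, "10.0.0.2", 6432)
    w = zlib.crc32(repr(four_tuple).encode()) % WORKERS  # stand-in flow hash
    counts[w] += 1
    # Heavy-tailed connection weights: most are light, a few are very heavy.
    load[w] += random.paretovariate(1.1)

print("connections per worker:", sorted(counts.values()))
print("work per worker:       ", [round(v) for v in sorted(load.values())])
```

Connection counts come out nearly equal, but total work per worker does not: a single heavy, long-lived connection stays hashed to one worker for its entire lifetime.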

Pairings

Seen in

  • sources/2020-06-23-zalando-pgbouncer-on-kubernetes-minimal-latency — Zalando runs two PgBouncer processes with so_reuseport as part of their benchmark harness specifically to explore how CPU placement of the two instances affects latency. The pattern is inseparable from the CPU-pinning conversation: multiple SO_REUSEPORT workers only make sense if you can control where they land.