
PATTERN

Background worker pool for async I/O

Problem

You want an application process to stop blocking on synchronous I/O, but the obvious answers (io_uring, libaio) have real downsides:

  • Kernel-interface-specific. io_uring is Linux-only and has a difficult security history — disabled by default in hardened sandbox / container contexts.
  • Keeps post-I/O CPU work in-process. If the caller must checksum / decompress / memcpy each buffer after read, that CPU cost is serialised per calling process, capping the benefit of async I/O.
  • Per-I/O overhead doesn't help at low concurrency. See concepts/async-io-concurrency-threshold.
  • Doesn't distribute across CPU cores automatically. A single calling process issuing async I/Os still runs post-I/O work on one core.

Pattern

Run a pool of dedicated background worker processes (or threads, depending on runtime) that perform I/O on behalf of the main application processes. When a backend needs data, it submits a request to the worker pool via shared memory / IPC and waits on a response. Workers pick up requests, perform the underlying synchronous I/O, and signal completion.

The key properties:

  • Worker-side I/O can be simple read() / write(), no special kernel interface required.
  • Worker processes run on different CPU cores. Post-I/O CPU work (checksums, memcpy, decompression) distributes across the pool naturally.
  • Worker count is tuneable at runtime via configuration.
  • Single-process API — the backend sees a future / promise shape even though the underlying mechanism is cross-process IPC, not in-process async.

Canonical wiki instance: Postgres 18 io_method=worker

Postgres 18 (September 2025) introduced the io_method option with three settings (sync, worker, io_uring). worker was chosen as the new default, not io_uring, the more headline-grabbing option.

Dicken's framing in sources/2025-10-14-planetscale-benchmarking-postgres-17-vs-18:

Using io_method=worker was a good choice as the new default. It comes with a lot of the "asynchronous" benefits of io_uring without relying on that specific kernel interface, and can be tuned by setting io_workers=X.

Postgres's io_workers defaults to 3. Each worker process receives I/O requests from backend processes via shared memory, issues the underlying OS read, and signals completion. From the backend's perspective the interface is async; from the OS's perspective each worker issues plain synchronous reads.
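In postgresql.conf terms, the configuration described above looks like this (values shown are the defaults per the text; comments are mine):

```ini
io_method = worker    # alternatives: sync, io_uring
io_workers = 3        # size of the background I/O worker pool
```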

Why worker beats io_uring on Postgres's workload shape

Per Tomas Vondra's tuning blog cited by Dicken:

  • Index scans don't yet use AIO in Postgres 18.x, so B-tree-dominated OLTP workloads only benefit from whatever the I/O method does for sequential and bitmap scans, which is most of the benchmark load.
  • Checksums + memcpy are CPU-bound and serial per Postgres backend. io_uring's async dispatch doesn't help because the backend is CPU-bound, not I/O-bound, after the read completes. worker distributes the CPU cost across worker processes.
  • Process-level parallelism is what Postgres needs. Postgres's process-per-backend architecture already uses separate processes for isolation; adding I/O workers is a natural extension.

PlanetScale's measured data: worker matches or beats io_uring on EBS-backed instances at all tested concurrency levels, and only loses on local NVMe at 50 connections with large range scans.

Variants

  • Thread pool instead of process pool. Runtimes without cheap forking (Java, Go, Rust) prefer a thread pool; the dispatch mechanism becomes an in-process queue rather than shared-memory IPC.
  • Per-device worker pool. One worker per underlying storage device to avoid head-of-line blocking across devices.
  • Mixed modes. Some reads go through the worker pool, others stay synchronous; io_uring backfills at a third tier for high-concurrency regimes where the worker overhead starts to cap throughput.
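The thread-pool variant is what most managed runtimes' standard libraries already give you. A minimal sketch (function and file names are illustrative): the executor's internal queue replaces the shared-memory IPC, and the caller gets the future/promise shape directly.

```python
# Thread-pool variant of the pattern: dispatch is an in-process queue
# managed by the executor; each pool thread issues a plain synchronous
# read. read_range is a made-up helper, not a real library function.
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def read_range(path: str, offset: int, length: int) -> bytes:
    """Ordinary synchronous read, executed on a pool thread."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

pool = ThreadPoolExecutor(max_workers=3)  # tuneable, like io_workers

with tempfile.NamedTemporaryFile(delete=False) as tf:
    tf.write(b"abcdefgh")
    path = tf.name

# The caller sees a future over what is, underneath, a blocking read.
future = pool.submit(read_range, path, 2, 3)
result = future.result()  # -> b"cde"

pool.shutdown()
os.unlink(path)
```

The trade-off versus the process pool is the one the variants list names: cheaper dispatch and no cross-process copy, but post-I/O CPU work shares the calling process's address space and, in runtimes with a global lock, may not parallelise as freely.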

Trade-offs

  • Pro: kernel-interface-agnostic. Works wherever synchronous read() works.
  • Pro: CPU distribution. Post-I/O work parallelises across worker CPUs.
  • Pro: tuneable pool size. Application-layer knob, not kernel parameter.
  • Pro: well-understood failure modes. Workers die like any other process and can be restarted.
  • Con: higher per-I/O overhead. IPC + scheduling cost per request. At very high I/O rates this exceeds io_uring's submission overhead.
  • Con: fixed worker count is a bottleneck. Too few workers = queueing; too many = context-switch overhead.
  • Con: cross-process memory copy. Data read by a worker must be delivered to the backend — typically via shared memory, but still not zero-cost.

When to use this pattern

  • The calling process is CPU-bound post-I/O. Checksumming, decompression, format parsing — CPU work parallelises across workers.
  • You need portability across kernel versions / operating systems.
  • You're already using a process-per-client architecture (Postgres, Apache prefork).
  • Your concurrency sits below io_uring's payoff threshold (see concepts/async-io-concurrency-threshold).

When not to use it

  • Ultra-high I/O rates where IPC overhead dominates (millions of IOPS per calling process). Go kernel-native async.
  • CPU-light workloads where the post-I/O path is nearly free and per-I/O overhead is the whole story.
  • Single-threaded event-loop architectures (Node.js, Redis) that would lose their simplicity gains from adding worker processes.

Seen in

  • sources/2025-10-14-planetscale-benchmarking-postgres-17-vs-18 — canonical wiki introduction. Postgres 18's io_method=worker is both the default and, per PlanetScale's measured benchmarks, the best-performing io_method on most tested configurations. Dicken's take: "worker comes with a lot of the 'asynchronous' benefits of io_uring without relying on that specific kernel interface."