PATTERN
Background worker pool for async I/O¶
Problem¶
You want an application process to stop blocking on synchronous
I/O, but the obvious answer (io_uring, libaio) has real
downsides:
- Kernel-interface-specific. `io_uring` is Linux-only and has a difficult security history; it is disabled by default in hardened sandbox / container contexts.
- Keeps post-I/O CPU work in-process. If the caller must checksum / decompress / `memcpy` each buffer after the read, that CPU cost is serialised per calling process, capping the benefit of async I/O.
- Per-I/O savings don't help at low concurrency. See concepts/async-io-concurrency-threshold.
- Doesn't distribute across CPU cores automatically. A single calling process issuing async I/Os still runs post-I/O work on one core.
Pattern¶
Run a pool of dedicated background worker processes (or threads, depending on runtime) that perform I/O on behalf of the main application processes. When a backend needs data, it submits a request to the worker pool via shared memory / IPC and waits on a response. Workers pick up requests, perform the underlying synchronous I/O, and signal completion.
The key properties:
- Worker-side I/O can be simple `read()`/`write()`; no special kernel interface required.
- Worker processes run on different CPU cores. Post-I/O CPU work (checksums, `memcpy`, decompression) distributes across the pool naturally.
- Worker count is tuneable at runtime via configuration.
- Single-process API — the backend sees a future / promise shape even though the underlying mechanism is cross-process IPC, not in-process async.
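The request/response flow can be sketched with a process pool and a future-shaped result. This is Python for illustration only: the real mechanism in Postgres is shared-memory IPC between C processes, and every name below is invented for the sketch.

```python
import hashlib
import os
import tempfile
from concurrent.futures import ProcessPoolExecutor

def read_and_checksum(path, offset, length):
    """Worker-side routine: a plain synchronous read, plus the
    post-I/O CPU work (here a checksum), all off the caller's core."""
    with open(path, "rb") as f:   # ordinary read(), no special kernel API
        f.seek(offset)
        data = f.read(length)
    return hashlib.sha256(data).hexdigest(), len(data)

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * 8192)
        path = f.name

    # Pool size is an application-layer knob, like io_workers in Postgres.
    with ProcessPoolExecutor(max_workers=3) as pool:
        # Asynchronous from the caller's point of view: submit returns
        # immediately with a future.
        fut = pool.submit(read_and_checksum, path, 0, 4096)
        digest, n = fut.result()  # block only when the result is needed
        print(n)                  # prints 4096
    os.unlink(path)
```

The caller sees a future / promise shape; the worker sees a plain blocking `read()`. That is the whole trick of the pattern.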
Canonical wiki instance: Postgres 18 io_method=worker¶
Postgres 18 (September 2025) introduced the `io_method` option with three settings. `worker` was chosen as the new default, not `io_uring`, the more headline-grabbing option.

Dicken's framing in sources/2025-10-14-planetscale-benchmarking-postgres-17-vs-18:

> Using `io_method=worker` was a good choice as the new default. It comes with a lot of the "asynchronous" benefits of `io_uring` without relying on that specific kernel interface, and can be tuned by setting `io_workers=X`.
Postgres's `io_workers` defaults to 3. Each worker process receives
I/O requests from backend processes via shared memory, issues the
underlying OS read, and signals completion. From the backend's
perspective the interface is async; from the OS's perspective each
worker issues plain synchronous reads.
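In configuration terms, this is a two-line knob. A sketch of the relevant postgresql.conf fragment, with the defaults as described above (the `sync` alternative is the third setting per the Postgres 18 release notes):

```ini
# postgresql.conf (Postgres 18)
io_method = worker      # default; the other settings are sync and io_uring
io_workers = 3          # default pool size; tune for the workload
```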
Why worker beats io_uring on Postgres's workload shape¶
Per Tomas Vondra's tuning blog cited by Dicken:
- Index scans don't yet use AIO in Postgres 18.x, so B-tree-dominated OLTP workloads see gains only where AIO applies: sequential and bitmap scans, which cover most of the benchmark load.
- Checksums + memcpy are CPU-bound and serial per Postgres backend. `io_uring`'s async dispatch doesn't help because the backend is CPU-bound, not I/O-bound, after the read completes; `worker` distributes that CPU cost across worker processes.
- Process-level parallelism is what Postgres needs. Postgres's process-per-backend architecture already uses separate processes for isolation; adding I/O workers is a natural extension.
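The checksum point is the crux: the CPU-bound step after each read parallelises once it moves into a pool. A purely illustrative Python sketch (Postgres's actual workers are C processes communicating over shared memory):

```python
import hashlib
from multiprocessing import Pool

def checksum(buf: bytes) -> str:
    # The post-I/O CPU work that would otherwise run serially in one backend.
    return hashlib.sha256(buf).hexdigest()

if __name__ == "__main__":
    # Stand-ins for the buffers returned by eight reads.
    buffers = [bytes([i]) * 65536 for i in range(8)]
    # Fan the CPU-bound step out across 3 worker processes (cf. io_workers = 3).
    with Pool(processes=3) as pool:
        digests = pool.map(checksum, buffers)
    print(len(digests))  # prints 8
```

With async dispatch alone (the `io_uring` path), all eight checksums would still run on the one backend's core; the pool is what buys the extra cores.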
PlanetScale's measured data: `worker` matches or beats `io_uring`
on EBS-backed instances at all tested concurrency levels, and only
loses on local NVMe at 50 connections with large range scans.
Variants¶
- Thread pool instead of process pool. Runtimes without cheap forking (Java, Go, Rust) prefer a thread pool; the dispatch mechanism becomes an in-process queue rather than shared-memory IPC.
- Per-device worker pool. One worker per underlying storage device to avoid head-of-line blocking across devices.
- Mixed modes. Some reads go through the worker pool, others stay synchronous; `io_uring` backfills as a third tier for high-concurrency regimes where worker overhead starts to cap throughput.
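The thread-pool variant can be sketched in a few lines. Everything here is illustrative (the class and method names are made up, not from any cited system); it shows how an in-process queue replaces shared-memory IPC while the caller still gets a future-shaped handle.

```python
import queue
import threading

class SimpleFuture:
    """One-shot result slot the caller can block on."""
    def __init__(self):
        self._event = threading.Event()
        self._value = None

    def set(self, value):
        self._value = value
        self._event.set()

    def result(self):
        self._event.wait()      # block only when the caller needs the value
        return self._value

class IOThreadPool:
    """Workers pull (fn, args, future) requests off an in-process queue."""
    def __init__(self, workers=3):
        self._q = queue.Queue()
        for _ in range(workers):
            threading.Thread(target=self._loop, daemon=True).start()

    def _loop(self):
        while True:
            fn, args, fut = self._q.get()
            fut.set(fn(*args))  # worker performs the blocking call

    def submit(self, fn, *args):
        fut = SimpleFuture()
        self._q.put((fn, args, fut))
        return fut              # caller sees a future immediately

# Usage: pool = IOThreadPool(); pool.submit(open("x").read).result()
```

The queue plays the role of the shared-memory request ring; everything else in the pattern carries over unchanged.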
Trade-offs¶
- Pro: kernel-interface-agnostic. Works wherever synchronous `read()` works.
- Pro: CPU distribution. Post-I/O work parallelises across worker CPUs.
- Pro: tuneable pool size. Application-layer knob, not kernel parameter.
- Pro: well-understood failure modes. Workers die like any other process and can be restarted.
- Con: higher per-I/O overhead. IPC + scheduling cost per request. At very high I/O rates this exceeds `io_uring`'s submission overhead.
- Con: fixed worker count is a bottleneck. Too few workers = queueing; too many = context-switch overhead.
- Con: cross-process memory copy. Data read by a worker must be delivered to the backend — typically via shared memory, but still not zero-cost.
When to use this pattern¶
- The calling process is CPU-bound post-I/O. Checksumming, decompression, format parsing — CPU work parallelises across workers.
- You need portability across kernel versions / operating systems.
- You're already using a process-per-client architecture (Postgres, Apache prefork).
- Your concurrency sits below `io_uring`'s payoff threshold (see concepts/async-io-concurrency-threshold).
When not to use it¶
- Ultra-high I/O rates where IPC overhead dominates (millions of IOPS per calling process). Go kernel-native async.
- CPU-light workloads where the post-I/O path is nearly free and per-I/O overhead is the whole story.
- Single-threaded event-loop architectures (Node.js, Redis) that would lose their simplicity gains from adding worker processes.
Seen in¶
- sources/2025-10-14-planetscale-benchmarking-postgres-17-vs-18 — canonical wiki introduction. Postgres 18's `io_method=worker` is both the default and, per PlanetScale's measured benchmarks, the best-performing `io_method` on most tested configurations. Dicken's take: "`worker` comes with a lot of the 'asynchronous' benefits of `io_uring` without relying on that specific kernel interface."