

Async I/O concurrency threshold

Definition

Async I/O concurrency threshold is the observation that asynchronous-I/O interfaces (like Linux's io_uring) only outperform synchronous I/O above a certain concurrency / I/O-rate level. Below that threshold, the overhead of async submission + completion tracking exceeds the latency-hiding benefit — synchronous I/O wins because it has a shorter code path on the hot path for a single I/O.

At low concurrency (few in-flight I/Os), there's nothing for async I/O to hide — the caller is waiting for one request at a time anyway, and sync I/O's simpler code path is faster.

At high concurrency (many in-flight I/Os), async I/O hides each request's latency behind the others, and throughput is bounded by the storage device's parallelism (multiple NAND targets, deep command queues).
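The asymmetry between the two regimes can be demonstrated with a toy simulation, where a sleep stands in for the device's latency floor (the 5 ms figure and thread-based concurrency are illustrative assumptions, not measurements from the benchmark):

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY_S = 0.005  # stand-in for the latency floor (illustrative ~5 ms)

def fake_read(block: int) -> int:
    time.sleep(LATENCY_S)  # the floor itself: NAND read / network RTT
    return block

def serial_reads(n: int) -> float:
    """One I/O at a time: the latency floors add up serially."""
    start = time.perf_counter()
    for i in range(n):
        fake_read(i)
    return time.perf_counter() - start

def concurrent_reads(n: int) -> float:
    """n I/Os in flight: the latency floors overlap almost entirely."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        list(pool.map(fake_read, range(n)))
    return time.perf_counter() - start

if __name__ == "__main__":
    n = 16
    print(f"serial:     {serial_reads(n):.3f}s  (roughly n * latency)")
    print(f"concurrent: {concurrent_reads(n):.3f}s  (roughly latency + overhead)")
```

With one request in flight the two paths take the same wall time and the concurrent version only adds setup cost, which is the low-concurrency regime in miniature.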

Why the threshold exists

The per-I/O cost decomposes into:

  • Submission overhead (set up the request, enqueue it).
  • Latency floor (the physics — NAND read, network RTT, seek).
  • Completion overhead (reap the result, dispatch downstream).

Sync I/O has low submission + completion overhead but serialises the latency floor — one I/O at a time from the caller's perspective.

Async I/O adds submission + completion overhead but amortises the latency floor across many concurrent requests.

Below the threshold the per-I/O cost satisfies sync_overhead + latency < async_overhead + (latency / in_flight), so sync wins; above it the inequality flips and async wins.
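The inequality can be made concrete with a toy cost model. All figures below are illustrative microsecond values chosen to show the crossover, not measurements:

```python
# Illustrative per-I/O cost model (all numbers are assumptions, in microseconds)
SYNC_OVERHEAD_US = 1.0    # submission + completion on the short sync path
ASYNC_OVERHEAD_US = 30.0  # ring setup, SQE/CQE handling, completion dispatch
LATENCY_FLOOR_US = 50.0   # e.g. a local-NVMe read

def sync_cost(latency_us: float) -> float:
    # Sync serialises the floor: every I/O pays it in full.
    return SYNC_OVERHEAD_US + latency_us

def async_cost(latency_us: float, in_flight: int) -> float:
    # Async amortises the floor across concurrent requests.
    return ASYNC_OVERHEAD_US + latency_us / in_flight

for in_flight in (1, 2, 4, 8, 32):
    s, a = sync_cost(LATENCY_FLOOR_US), async_cost(LATENCY_FLOOR_US, in_flight)
    print(f"in_flight={in_flight:2d}: sync={s:5.1f}us async={a:5.1f}us "
          f"-> {'sync' if s < a else 'async'} wins")
```

With these numbers the crossover sits between 2 and 4 in-flight I/Os; raising the latency floor or lowering the async overhead moves the threshold down, which is the knob the "Additional factors" section below turns.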

Canonical wiki instance (PlanetScale 2025-10-14)

Ben Dicken's Postgres 17 vs Postgres 18 benchmark (sources/2025-10-14-planetscale-benchmarking-postgres-17-vs-18) provides the canonical empirical observation. Testing Postgres 18 with io_method set to sync, worker, and io_uring across 1 / 10 / 50 connections on EBS and local NVMe:

  • At 1 connection on EBS, io_uring loses to sync and worker. Surprising result: "I'll admit, this surprised me! My expectation was that io_uring would perform as well as if not better than all these options."
  • At 10 connections on gp3-3k, io_uring is significantly worse than the other options.
  • At 50 connections on gp3-3k, io_uring is only slightly worse than the other options — the gap narrows.
  • At 50 connections on local NVMe, io_uring slightly beats the other options — the threshold is finally crossed.

Dicken's explicit formulation: "io_uring performs well when there's lots of I/O concurrency, but in low-concurrency scenarios it isn't as beneficial."

Additional factors on the threshold

  • Storage latency floor. On network-attached storage (~250 μs round-trip), the latency floor dominates so thoroughly that async-I/O concurrency-hiding doesn't help much. On local NVMe (~50 μs), the floor is low enough that async parallelism matters.
  • Post-I/O CPU work. If the caller is CPU-bound after the I/O completes (checksums, memcpy, decompression), then async I/O's latency-hiding is upper-bounded by per-process CPU saturation. This is why Postgres 18 ships with io_method=worker as default, not io_uring: worker spreads the post-I/O CPU across processes too.
  • Workload shape. Point selects issue one I/O at a time and therefore sit below the threshold regardless of connection count. Range scans issue many I/Os per query and can cross the threshold at modest connection counts.
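The workload-shape point reduces to a per-caller question: how many I/Os can one backend keep in flight at once? A minimal sketch, with a hypothetical crossover depth of 4 standing in for whatever a given device and overhead profile actually yield:

```python
THRESHOLD = 4  # hypothetical crossover depth for some device (assumption)

def regime(ios_in_flight_per_backend: int) -> str:
    """Classify a workload by the concurrency a single caller sustains."""
    if ios_in_flight_per_backend > THRESHOLD:
        return "above threshold (async helps)"
    return "below threshold (sync wins)"

print(regime(1))   # point select: one I/O at a time, any connection count
print(regime(64))  # range scan with readahead: many I/Os queued per query
```

This is why connection count alone does not predict the winner: fifty connections each doing point selects still leaves every individual backend below the threshold.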

Implications for system design

  • Don't assume io_uring is always faster. Benchmark the specific workload + concurrency + storage combination.
  • The io_method=worker hybrid is a deliberate middle ground. Farming I/O out to worker processes distributes both the I/O submission and the post-I/O CPU work, which helps at concurrency levels where io_uring still costs more than it saves.
  • Storage latency dominates at low concurrency. On network-attached storage, shave the latency floor (direct-attached NVMe) before reaching for async-I/O knobs.
  • Applications need to know their concurrency regime. OLTP backends servicing bursty single-connection traffic are below the threshold. Batch / analytics / streaming workloads tend to be above it.
