

Postgres async I/O

Definition

Postgres async I/O is the feature introduced in Postgres 18 (September 2025) that allows the database to issue asynchronous read I/Os to the operating system, rather than issuing every read synchronously and blocking the backend process until each one completes. The feature is exposed via the new io_method configuration option, which takes three values:

| io_method | Behaviour |
| --------- | --------- |
| sync | All reads issued synchronously. Matches Postgres 17 and earlier. |
| worker | New default. Dedicated background worker processes handle I/O; the calling backend submits a request and waits on a shared-memory response. |
| io_uring | Reads issued via Linux's io_uring kernel interface. |

Only io_method=sync does what every previous Postgres version did.
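io_method is an ordinary server setting, so on a Postgres 18 server it can be inspected and changed like any other GUC. A minimal sketch (changing io_method takes effect only after a server restart):

```sql
-- Inspect the current I/O method (Postgres 18+)
SHOW io_method;

-- Switch methods; the new value is picked up at the next server start
ALTER SYSTEM SET io_method = 'worker';
```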

Why Postgres added async I/O

The motivation is the widening gap between CPU speed and storage latency in modern cloud deployments. With synchronous I/O, each backend process waits out a full storage round trip for every read before it can issue the next: roughly 50 μs on local NVMe and roughly 250 μs on network-attached storage. A backend performing 1000 synchronous reads at 250 μs each spends 250 ms in pure I/O wait. Async I/O lets the same backend keep many reads in flight at once and overlap their latencies.
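The arithmetic above generalises to a simple back-of-envelope model. The queue depth of 32 below is an illustrative assumption, not a Postgres setting:

```python
# Back-of-envelope: total wait for 1000 reads at 250 µs each,
# synchronous vs. issued with 32 I/Os kept in flight (illustrative).
reads = 1000
latency_us = 250          # per-read round trip on network-attached storage
queue_depth = 32          # assumed number of overlapping in-flight reads

sync_wait_ms = reads * latency_us / 1000
async_wait_ms = (reads / queue_depth) * latency_us / 1000  # ideal overlap

print(sync_wait_ms)   # 250.0
print(async_wait_ms)  # 7.8125
```

The ideal-overlap figure is a lower bound; real queues never overlap perfectly, but the two-orders-of-magnitude gap is the point.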

What's actually async in Postgres 18

The Dicken and Vondra posts are specific about the current (18.x) scope:

  • Reads only. Writes, including WAL fsync, still use synchronous paths.
  • Sequential and bitmap heap scans use AIO.
  • Index scans do NOT yet use AIO — B-tree navigation remains synchronous. This matters because indexed lookups are the dominant shape of OLTP.
  • Post-read work is synchronous. Even when the OS completes a read asynchronously, Postgres must checksum the page and memcpy it into the shared-buffer pool; these are CPU-serial per-process.
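The synchronous post-read step can be sketched as a toy model. Nothing here is Postgres code: zlib.crc32 stands in for the real page checksum, and the byte copy stands in for the memcpy into shared buffers.

```python
import zlib

PAGE_SIZE = 8192  # Postgres heap page size

def complete_read(raw: bytes, expected_crc: int) -> bytes:
    """Toy model of the post-read path: even when the OS finishes the
    read asynchronously, a Postgres process still spends its own CPU
    verifying the page and copying it into the shared-buffer pool."""
    if zlib.crc32(raw) != expected_crc:   # checksum verification (placeholder)
        raise ValueError("page checksum mismatch")
    return bytes(raw)                     # stands in for the memcpy

page = b"\x00" * PAGE_SIZE
copied = complete_read(page, zlib.crc32(page))
print(len(copied))  # 8192
```

Under io_method=sync this per-page CPU work runs in the calling backend; the worker model's appeal is that it can run in the worker instead.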

Why worker is the default (not io_uring)

The intuitive expectation — "io_uring is kernel-native async and therefore must be fastest" — is not what the Postgres committers or PlanetScale's benchmarks found:

  • worker distributes CPU across processes. Each worker process can execute the checksum + memcpy path independently. io_uring keeps all of that in the calling process.
  • worker doesn't require a specific kernel interface. io_uring is Linux-only and has a difficult security history (disabled in some hardened contexts). worker works anywhere Postgres works.
  • worker is tuneable at the application layer. io_workers=N (default 3) scales the pool; io_uring tuning requires kernel-side parameters.
  • io_uring's per-I/O wins are dominated by other floors. The per-I/O latency floor on network-attached storage is the network hop (~250 μs); the per-I/O latency floor on local NVMe is the NAND read itself (~50 μs). io_uring reduces kernel overhead that was already a small share of the round-trip.
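The first point above can be caricatured in a few lines. This sketch uses threads where Postgres uses separate worker processes communicating through shared memory, so it is only an analogy for how the post-read checksum work leaves the calling backend:

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

IO_WORKERS = 3  # mirrors the io_workers default

def handle_read(block_no: int) -> tuple[int, int]:
    """A 'worker' services one read request: fetch the page (faked here)
    and run the post-read checksum on its own CPU, not the backend's."""
    raw = bytes(8192)                  # pretend this block came back from the OS
    return block_no, zlib.crc32(raw)

# The 'backend' submits a batch of read requests and waits on the responses.
with ThreadPoolExecutor(max_workers=IO_WORKERS) as pool:
    results = dict(pool.map(handle_read, range(8)))

print(len(results))  # 8
```

With io_uring, by contrast, every completion (and its checksum + copy) lands back in the single calling process.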

See patterns/background-worker-pool-for-async-io for the generalisation.

Measured behaviour (PlanetScale 2025-10-14)

Across 96 sysbench oltp_read_only runs covering Postgres 17 and all three Postgres 18 io_method modes:

| Scenario | Winner |
| -------- | ------ |
| Single connection, --range_size=100, EBS gp3/io2 | Postgres 18 sync / worker (beat 17 and io_uring) |
| Single connection, --range_size=100, local NVMe | Postgres 18, roughly tied across modes |
| 50 connections, --range_size=100, EBS | IOPS-cap-limited; modest 18-over-17 win |
| 50 connections, --range_size=10000, local NVMe | io_uring wins slightly |
| 10 connections, --range_size=100, gp3-3k | io_uring significantly worse |

Dicken's takeaway: "io_uring performs well when there's lots of I/O concurrency, but in low-concurrency scenarios it isn't as beneficial." This is the canonical worked example of concepts/async-io-concurrency-threshold.

Tuning surface

Key knobs exposed by Postgres 18:

  • io_method = sync | worker | io_uring
  • io_workers — number of worker processes (default 3); relevant only when io_method=worker.
  • effective_io_concurrency — historically a hint to prefetch under posix_fadvise; Postgres 18 uses it to size the per-backend AIO queue.
  • max_parallel_workers — orthogonal; these parallelise query execution, not I/O.
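Pulled together, the I/O-specific knobs make a short postgresql.conf fragment. The values here are illustrative, not recommendations:

```
io_method = worker              # sync | worker | io_uring
io_workers = 3                  # worker-pool size; only used when io_method = worker
effective_io_concurrency = 16   # sizes the per-backend AIO queue in Postgres 18
```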

Vondra's post walks the interaction surface in detail.

Why it's not a free upgrade

  • Read-only benefit. No write-path improvement in 18.x.
  • Index-path bypass. B-tree-dominated OLTP workloads don't see much benefit.
  • Post-I/O CPU ceiling. Bulk-scan workloads hit checksum / memcpy walls regardless of how fast the I/O comes back.
  • Tail-latency comparison not reported. PlanetScale's benchmarks report average QPS; whether io_uring shifts p99 / p99.9 favourably in any scenario is not disclosed.
  • Requires workload fit. Workloads dominated by indexed lookups on a warm buffer pool (most production OLTP) see little difference.
