
SYSTEM Cited by 1 source

Linux io_uring

What it is

io_uring is the Linux kernel's asynchronous I/O interface (merged in Linux 5.1, May 2019), designed to replace the earlier aio / libaio interface. It exposes two shared-memory ring buffers between user space and the kernel, a submission queue (SQ) and a completion queue (CQ), so that I/O requests can be submitted and completed without a syscall per operation. A single process can keep thousands of I/Os in flight while avoiding both the per-operation syscall overhead and the blocking semantics of synchronous read() / write().

Why it matters

Traditional POSIX read() / write() calls are synchronous — the calling thread blocks until the I/O completes. Applications that want concurrency use either:

  • Many threads, each blocking on its own I/O (thread-per-I/O, scales poorly past thousands).
  • Non-blocking I/O + epoll, which works for sockets but not for disk reads (Linux disk I/O has no fully non-blocking mode pre-io_uring).
  • POSIX aio, which on Linux is implemented by glibc as a user-space thread pool wrapping synchronous calls, not true kernel async.

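io_uring is the first kernel-native async interface on Linux that works for ordinary buffered disk I/O; the older native AIO interface (libaio) is only reliably asynchronous for O_DIRECT files. Applications submit batches of I/Os to the SQ, call io_uring_enter once, and later reap completions from the CQ. With submission-queue polling enabled (IORING_SETUP_SQPOLL), a kernel thread picks up new SQ entries on its own, and because completions are read directly from the shared CQ ring, the steady state can need zero additional syscalls.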
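The submit-once, reap-later flow can be illustrated with a toy model. This is a sketch only: ToyRing and its methods are hypothetical names, nothing here touches the real kernel ABI, and real completions may arrive in any order. It exists to show where the syscall boundary sits relative to queue fills and queue reads.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyRing:
    """Toy stand-in for an io_uring instance: two queues and a syscall counter."""
    entries: int
    sq: deque = field(default_factory=deque)  # submission queue (user -> kernel)
    cq: deque = field(default_factory=deque)  # completion queue (kernel -> user)
    syscalls: int = 0

    def prep_read(self, user_data: int, nbytes: int) -> None:
        # Filling an SQ entry is a plain memory write, not a syscall.
        if len(self.sq) >= self.entries:
            raise BufferError("submission queue full")
        self.sq.append((user_data, nbytes))

    def enter(self) -> int:
        # Models one io_uring_enter call: the "kernel" consumes every queued
        # entry and posts one completion per request.
        self.syscalls += 1
        submitted = len(self.sq)
        while self.sq:
            user_data, nbytes = self.sq.popleft()
            self.cq.append((user_data, nbytes))  # pretend all bytes were read
        return submitted

    def reap(self) -> list:
        # Reading the CQ is a plain memory read of the shared ring.
        done = list(self.cq)
        self.cq.clear()
        return done


ring = ToyRing(entries=64)
for i in range(32):
    ring.prep_read(user_data=i, nbytes=8192)
submitted = ring.enter()
completions = ring.reap()
print(submitted, len(completions), ring.syscalls)  # 32 32 1
```

In this model 32 reads cost one syscall, where 32 blocking read() calls would cost 32.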

How Postgres 18 uses it

Postgres 18's new io_method knob exposes io_uring as one of three options (alongside sync and worker, the new default). Setting io_method=io_uring causes Postgres to issue read requests via the io_uring interface, allowing the kernel to service multiple outstanding I/Os per process without thread-switching overhead.
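A minimal configuration sketch, assuming a Postgres 18 build compiled with liburing support and a kernel that permits io_uring. io_method is read at server start, so changing it requires a restart.

```
# postgresql.conf (Postgres 18)
io_method = io_uring    # alternatives: worker (the default), sync
```

After the restart, running SHOW io_method; in psql should report the active setting.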

The catch (from PlanetScale's benchmarks + Tomas Vondra's tuning blog):

  • Index scans don't yet use AIO. The B-tree-navigation paths — which dominate most OLTP — remain synchronous. io_uring's async-read benefit only applies to sequential / range scans.
  • Post-I/O work is still synchronous. Even when reads happen in the background, Postgres must checksum pages and memcpy them into the shared-buffer pool — these are CPU-bound and serial per-process.
  • Only reads are async in Postgres 18. Writes (including WAL fsync) still use synchronous paths. io_uring supports async writes, but Postgres 18 has not yet adopted them.
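The second point can be made concrete with a back-of-the-envelope model. This is a rough sketch, not a description of Postgres internals: scan_time_ms is a hypothetical helper, and the page count, latencies, and queue depths are made-up illustrative values rather than measurements.

```python
def scan_time_ms(pages: int, io_ms: float, cpu_ms: float, depth: int) -> float:
    """Rough time for one backend to read and process `pages` pages.

    depth == 1 models synchronous reads: wait for each page, then process it.
    depth > 1 models async reads: up to `depth` reads overlap, but checksum
    and copy work (cpu_ms per page) still runs serially in the one process.
    """
    if depth == 1:
        return pages * (io_ms + cpu_ms)
    io_total = pages * io_ms / depth   # overlapped device wait
    cpu_total = pages * cpu_ms         # serial post-I/O work
    return max(io_total, cpu_total)    # the slower stage dominates


# Hypothetical numbers: 8000 pages, 0.125 ms per read, 0.03125 ms CPU per page.
for depth in (1, 2, 8, 64):
    print(depth, scan_time_ms(8000, 0.125, 0.03125, depth))
```

Under these assumptions the time falls from 1250 ms to 500 ms at depth 2, then stops at 250 ms, because the serial per-page CPU work becomes the bottleneck no matter how many reads are in flight.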

The practical result, per PlanetScale's data: io_uring only outperforms sync / worker at high concurrency + large range scans on local NVMe — the scenario where async I/O parallelism is load-bearing and the per-I/O latency floor doesn't dominate.

io_uring vs the worker pool

Postgres 18's io_method=worker — the new default — takes a different design path to the same goal: dedicated background worker processes handle I/O instead of using io_uring. See patterns/background-worker-pool-for-async-io. The trade-off:

  • io_uring has lower per-I/O overhead (no cross-process context switch) but requires a specific kernel interface and keeps post-I/O CPU work in the same process.
  • worker has higher per-I/O overhead (IPC + scheduling) but distributes CPU work across processes, which for many workloads matters more than shaving per-I/O latency.

Dicken's measured outcome in the PlanetScale benchmarks: worker matches or beats io_uring on EBS-backed storage at all concurrency levels and only loses on local NVMe at 50 connections with large scans.

Applications using io_uring in production

  • Postgres 18 (2025, one of three io_method options).
  • RocksDB — optional io_uring-backed FileSystem for read-heavy workloads.
  • ScyllaDB — full replacement for POSIX I/O in the Seastar reactor.
  • QEMU — block-device I/O backend (aio=io_uring).
  • fio — the canonical disk benchmark has an io_uring engine.
  • liburing — the user-space helper library maintained by Jens Axboe (the kernel interface author).

Security posture

io_uring has had a checkered security history. Multiple kernel CVEs have been reported since 2020; in 2023 Google disabled io_uring in ChromeOS and blocked app access to it on Android; and several container runtimes block it via seccomp by default. Postgres 18's default (io_method=worker) sidesteps this by not requiring io_uring to be enabled.
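One way to check whether a host has io_uring switched off system-wide is the kernel.io_uring_disabled sysctl, available on Linux 6.6 and later. The sketch below is illustrative: the function names are hypothetical, and it cannot detect a seccomp filter applied by a container runtime, which blocks the syscalls without changing this sysctl.

```python
from pathlib import Path
from typing import Optional

# Meanings of kernel.io_uring_disabled (Linux 6.6 and later).
_MEANING = {
    0: "enabled for all processes",
    1: "disabled for unprivileged processes outside io_uring_group",
    2: "disabled for all processes",
}


def read_io_uring_sysctl(
    path: str = "/proc/sys/kernel/io_uring_disabled",
) -> Optional[str]:
    """Return the raw sysctl value, or None if the file cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None  # older kernel, non-Linux system, or restricted /proc


def describe_io_uring_sysctl(raw: Optional[str]) -> str:
    """Translate the raw sysctl value into a human-readable status."""
    if raw is None:
        return "sysctl not present (kernel older than 6.6, or not Linux)"
    try:
        value = int(raw)
    except ValueError:
        return f"unrecognized value {raw!r}"
    return _MEANING.get(value, f"unrecognized value {raw!r}")


print(describe_io_uring_sysctl(read_io_uring_sysctl()))
```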

Seen in

  • sources/2025-10-14-planetscale-benchmarking-postgres-17-vs-18 — canonical wiki introduction. Postgres 18's io_method=io_uring benchmarked against sync and worker; loses on EBS at low concurrency; only wins at 50 connections + --range_size=10000 on the local-NVMe i7i instance. Vondra's three architectural reasons for why worker beats io_uring on most workloads are cited and canonicalised.