SYSTEM Cited by 1 source

iostat

What it is

iostat reports per-block-device I/O statistics on Linux and ships in the sysstat package. It is the canonical tool for disk-side performance triage, and the pivot target when vmstat's wa column or a high %iowait reading points at disk as the bottleneck.

Canonical invocation

iostat -xz 1

-x — extended statistics (the useful columns — await, avgqu-sz, %util).
-z — omit devices with no activity, so the output is scoped to what's actually busy.
1 — one-second samples.
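
For scripted collection, the same invocation can be driven from Python. This is a minimal sketch rather than anything from the Netflix checklist: the helper names `iostat_command` and `run_iostat` are hypothetical, and it assumes iostat is on PATH. iostat accepts an optional sample count after the interval, which the sketch uses so that the command terminates.

```python
import shutil
import subprocess


def iostat_command(interval_s=1, count=None):
    """Build the argv for `iostat -x -z <interval> [count]`."""
    cmd = ["iostat", "-x", "-z", str(interval_s)]
    if count is not None:
        # Without a count, iostat keeps sampling until it is interrupted.
        cmd.append(str(count))
    return cmd


def run_iostat(interval_s=1, count=2):
    """Run iostat and return its stdout, or None if iostat is not installed."""
    if shutil.which("iostat") is None:
        return None
    result = subprocess.run(
        iostat_command(interval_s, count),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With a count of 2, the first report covers the period since boot and the second covers the most recent interval, so the second is usually the one worth reading.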

Key output columns

Column	Meaning
`r/s`, `w/s`	Read / write requests completed per second (the applied workload, in operations)
`rkB/s`, `wkB/s`	Read / write throughput in kB per second
`await`	Average I/O completion time (queue + service) in ms
`r_await`, `w_await`	Split-direction await
`avgqu-sz`	Average queue depth (labelled `aqu-sz` in newer sysstat releases)
`svctm`	Average service time (unreliable on modern devices and dropped from newer sysstat releases; `await` is the one to trust)
`%util`	Percent of time the device was doing work
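
To work with these columns programmatically, the extended report can be split into one dictionary per device. The sketch below is an assumption-laden illustration: `parse_iostat_x` is a hypothetical helper, the `SAMPLE` text uses made-up numbers and a reduced column set, and the parser assumes period decimal separators (for example, output produced under `LC_ALL=C`). It reads column names from the header row, so it does not depend on which columns a given sysstat version prints.

```python
def parse_iostat_x(text):
    """Parse `iostat -x` text into a list of reports.

    Each report is a dict mapping device name -> {column name: float}.
    """
    reports = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            header = None  # a blank line ends the current device table
            continue
        if fields[0].rstrip(":") == "Device":  # some versions print "Device:"
            header = fields[1:]
            reports.append({})
            continue
        # Device rows have one name field plus one value per header column.
        if header is not None and len(fields) == len(header) + 1:
            reports[-1][fields[0]] = dict(zip(header, map(float, fields[1:])))
    return reports


# Illustrative sample only: invented values, fewer columns than real output.
SAMPLE = """\
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.00    0.00    0.50    2.00    0.00   96.50

Device            r/s     w/s   rkB/s   wkB/s   await  avgqu-sz  %util
sda             10.00   20.00  160.00  640.00    4.50      0.14  12.00
"""

if __name__ == "__main__":
    for device, stats in parse_iostat_x(SAMPLE)[0].items():
        print(device, stats["await"], stats["%util"])
```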

Interpretation rules from the Netflix checklist

%util above 60% usually hurts performance, and values near 100% usually indicate saturation, with a crucial caveat: "if the storage device is a logical disk device fronting many back-end disks, then 100% utilization may just mean that some I/O is being processed 100% of the time, however, the back-end disks may be far from saturated, and may be able to handle much more work." The caveat applies to LVM, software RAID, and virtualised cloud block storage.
avgqu-sz above 1 is often evidence of saturation, with the same caveat: virtual devices may be serving many concurrent back-end requests.
await larger than expected points to device saturation or device problems.
r/s, w/s, rkB/s, wkB/s characterise the workload: what load is actually being applied? A performance problem "may simply be due to an excessive load applied."
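
The rules above can be sketched as a small triage function over one device's row. Everything named here is hypothetical: `triage_device`, the `expected_await_ms` parameter (the checklist says "larger than expected" without giving a number, so the caller has to supply one), and the reading of "~100%" as 99 or higher. The findings are prompts for investigation, not verdicts, since the logical-device caveat applies to each of them.

```python
def triage_device(stats, expected_await_ms=None):
    """Return a list of findings for one device's `iostat -x` row.

    `stats` maps column names to floats, e.g. {"%util": 72.0, "avgqu-sz": 0.4}.
    """
    findings = []

    util = stats.get("%util")
    if util is not None:
        if util >= 99.0:  # assumption: "~100%" interpreted as >= 99
            findings.append(
                "%util near 100%: usually saturated, unless this is a "
                "logical or virtual device fronting many back-end disks"
            )
        elif util > 60.0:
            findings.append("%util above 60%: usually hurts performance")

    # The queue-depth column is avgqu-sz or aqu-sz depending on sysstat version.
    qdepth = stats.get("avgqu-sz", stats.get("aqu-sz"))
    if qdepth is not None and qdepth > 1.0:
        findings.append(
            "queue depth above 1: often saturation, though virtual devices "
            "may be serving many requests concurrently"
        )

    await_ms = stats.get("await")
    if (
        expected_await_ms is not None
        and await_ms is not None
        and await_ms > expected_await_ms
    ):
        findings.append("await above expectation: saturation or device problems")

    return findings


if __name__ == "__main__":
    row = {"%util": 72.0, "avgqu-sz": 0.4, "await": 3.0}
    for finding in triage_device(row, expected_await_ms=10.0):
        print(finding)
```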

The `%util` interpretation problem on modern devices

The %util column originated in an era of single-queue single-head HDDs where "busy" and "saturated" were the same thing. On modern NVMe SSDs that can service many commands concurrently, %util = 100% is the start of the useful range, not the end. Cloud block devices (EBS, Azure Disk, GCE PD) are multi-backend and even more decoupled from the %util signal. Netflix's framing is explicit — %util is a busy percent, not a saturation signal on its own.
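
One way to see why busy is not the same as saturated is Little's law: the average number of requests in flight equals the completion rate multiplied by the average time each request spends in the system. The sketch below applies it to iostat's own columns. It is an estimate that assumes roughly steady load over the sampling interval; the function name and the example figures are hypothetical.

```python
def estimated_inflight(r_per_s, w_per_s, r_await_ms, w_await_ms):
    """Estimate average in-flight requests via Little's law (L = rate * time).

    Rates are requests per second; await values are in milliseconds.
    """
    return (r_per_s * r_await_ms + w_per_s * w_await_ms) / 1000.0


if __name__ == "__main__":
    # Hypothetical NVMe-like figures: 40k reads/s at 0.25 ms, 10k writes/s at 0.5 ms.
    print(estimated_inflight(40000, 10000, 0.25, 0.5))  # prints 15.0
```

A device in this state is busy essentially all the time, so %util reads close to 100%, yet it is handling around 15 requests at once with sub-millisecond latency. Whether it has headroom depends on how many concurrent requests the hardware or back end can absorb, which %util alone does not reveal.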

Why async I/O changes the interpretation

Netflix's caveat on the whole disk-perf axis:

"Bear in mind that poor performing disk I/O isn't necessarily an application issue. Many techniques are typically used to perform I/O asynchronously, so that the application doesn't block and suffer the latency directly (e.g., read-ahead for reads, and buffering for writes)."

High await on a device does not automatically mean the application is suffering; the application may have buffered or batched the request out of its critical path.
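
The write-buffering half of that caveat can be observed directly. The following is a rough sketch with a hypothetical function name: it times a write that only has to reach the kernel's page cache, then times the `os.fsync` call that forces the data to the device. Results depend heavily on the filesystem and hardware (on a memory-backed temporary directory the sync costs almost nothing), so it prints measurements rather than promising any particular ratio.

```python
import os
import tempfile
import time


def time_write_then_fsync(size_bytes=1 << 20):
    """Return (buffered_write_seconds, fsync_seconds) for one temp-file write."""
    payload = b"x" * size_bytes
    with tempfile.NamedTemporaryFile() as f:
        t0 = time.perf_counter()
        f.write(payload)
        f.flush()  # hand the data to the kernel; it may still be only in memory
        t1 = time.perf_counter()
        os.fsync(f.fileno())  # block until the kernel reports the data as written
        t2 = time.perf_counter()
    return t1 - t0, t2 - t1


if __name__ == "__main__":
    buffered_s, synced_s = time_write_then_fsync()
    print(f"buffered write: {buffered_s * 1000:.3f} ms")
    print(f"fsync:          {synced_s * 1000:.3f} ms")
```

An application that never waits on the sync step only experiences the first number, which is why device-level await and application-level latency can diverge.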

Seen in

sources/2025-07-29-netflix-linux-performance-analysis-in-60-seconds — iostat -xz 1 is command #6 in the 60-second checklist. The follow-up tool when %iowait or low page cache points at disk.

concepts/io-wait · concepts/use-method
systems/vmstat — the tool whose wa column triggers the pivot to iostat.
systems/sysstat-package
patterns/sixty-second-performance-checklist