IOPS (Input/Output Operations Per Second)

Definition

IOPS — Input/Output Operations Per Second — is the rate at which a storage device performs read or write operations against its backing media. It is one of the two independent capacity dimensions for any storage system; the other is throughput (bytes moved per second).

Ben Dicken's canonical framing:

"IOPS is shorthand for Input/Output Operations Per Second. In other words, how many times per-second does the system perform a read or write operation on the underlying storage volume."

(Source: sources/2026-04-21-planetscale-increase-iops-and-throughput-with-sharding)

Why "one operation" is storage-system-specific

There is no universal definition of "one operation". Every storage system fixes a per-IOP block size, and that size determines how the workload's byte-level demand translates into IOPS budget.

  • AWS gp3 EBS: one IOP = one 64 KiB disk read or write.
  • AWS gp2 EBS: one IOP covers up to 256 KiB per operation (burst model; see below).
  • AWS io2 Block Express: similar 64 KiB block accounting.
  • Direct-attached NVMe: hardware-limited, no administrative IOP definition — the drive's command queue accepts I/Os at whatever size the OS submits.

Dicken verbatim:

"But what counts as a single operation? This depends on the cloud provider and storage system being used. … If we choose to use gp3, a single operation is measured as a one 64 KiB disk read or write."
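The mapping from request size to IOPS budget can be sketched (a minimal sketch, assuming small requests round up to one IOP and large requests split into per-IOP-sized chunks, consistent with the gp3 semantics above; `iops_needed` is a hypothetical helper, not a provider API):

```python
import math

KIB = 1024
GP3_IOP = 64 * KIB  # gp3 accounting per the source: one IOP = one 64 KiB read/write


def iops_needed(request_bytes: int, per_iop_bytes: int) -> int:
    """IOPs consumed by a single request: every request burns at least one
    IOP, and larger requests are split into per-IOP-sized chunks."""
    return max(1, math.ceil(request_bytes / per_iop_bytes))


iops_needed(4 * KIB, GP3_IOP)    # 1 -- a small I/O still costs a full IOP
iops_needed(256 * KIB, GP3_IOP)  # 4 -- a large I/O splits into 64 KiB IOPs
```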

The naive throughput formula (only works for sequential I/O)

For purely sequential I/O under the gp3 block size:

max_throughput = IOPS × 64 KiB

A gp3 volume with the default 3,000 IOPS would compute to 3,000 × 64 KiB = 187.5 MiB/s — but two factors break the naive formula:

  1. Default throughput cap. gp3 has a 125 MiB/s default throughput limit independent of IOPS; the cap kicks in well before the 187.5 MiB/s ceiling.
  2. Random I/O penalty. Random reads count as a full IOP regardless of bytes requested: a 4 KiB random read still burns one full 64 KiB IOP. See concepts/sequential-vs-random-io.

So the naive IOPS × block_size formula is only an upper bound on throughput; real workloads operate below it.
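Both corrections can be checked with quick arithmetic (a sketch using the gp3 defaults cited above):

```python
KIB, MIB = 1024, 1024 * 1024

iops = 3_000                 # gp3 default IOPS
iop_size = 64 * KIB          # gp3 per-IOP block size
throughput_cap = 125 * MIB   # gp3 default throughput limit, independent of IOPS

# 1. Naive ceiling: every IOP moves a full 64 KiB.
naive = iops * iop_size / MIB                   # 187.5 MiB/s

# 2. The independent throughput cap binds first for sequential I/O.
sequential = min(naive, throughput_cap / MIB)   # 125.0 MiB/s

# 3. Random 4 KiB reads: each burns a whole IOP but moves only 4 KiB.
random_4k = iops * 4 * KIB / MIB                # ~11.7 MiB/s
```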

Why IOPS matters for databases

OLTP databases issue many small I/Os:

  • Writes: each commit fsyncs the WAL (one or more pages); each index update modifies one or more pages.
  • Reads: point lookups on unindexed access paths become random I/O on the primary; buffer-pool misses fan out to disk per page.

At the gp3 default of 3,000 IOPS, a database doing ~1,500 write transactions/second with ~2 page writes per transaction is at the IOPS cap before any read I/O. Production workloads routinely need 10,000-100,000+ IOPS — which means upgrading to provisioned-IOPS volumes (io1 / io2) at significantly higher cost, or sharding to spread the demand across multiple volumes.
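The sizing arithmetic above can be made explicit (a sketch; `write_iops_demand` is a hypothetical helper, and it assumes each small page write burns one full IOP under random-I/O accounting, as described above):

```python
def write_iops_demand(tx_per_sec: float, page_writes_per_tx: float) -> float:
    """Write-side IOPS demand: one full IOP per page write."""
    return tx_per_sec * page_writes_per_tx


demand = write_iops_demand(1_500, 2)  # 3000.0 -- the entire gp3 default budget
headroom_for_reads = 3_000 - demand   # 0.0 -- read I/O now queues behind writes
```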

Provisioned vs burst IOPS

Two models exist for exposing IOPS to a customer:

  • Provisioned IOPS (gp3, io1, io2): a fixed IOPS cap per volume; paying more raises the cap. gp3 goes from 3k default to 16k max; io2 goes up to 256k per volume.
  • Burst IOPS (gp2): baseline IOPS tied to volume size (3 IOPS per GiB), plus a burst bucket that accumulates credits while the volume runs below its baseline and spends them during bursts above it. Once the bucket empties, the volume falls back to the baseline.

The provisioned model is more predictable under sustained load; the burst model is cheaper for workloads that are genuinely bursty (ad-hoc reporting, dev/test environments).
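The burst model can be illustrated with a toy simulation (a sketch only: it assumes a full bucket at t = 0, the standard 3 IOPS/GiB baseline, a 3,000 IOPS burst limit, and a 5.4 M-credit bucket, and it ignores the finer points of AWS's actual credit accounting):

```python
def simulate_gp2(volume_gib: int, demand_iops: int, seconds: int,
                 burst_max: int = 3_000, bucket_capacity: int = 5_400_000):
    """Toy gp2 burst bucket: credits accrue at the baseline rate and are
    spent for every IOP served, so the bucket drains whenever demand
    exceeds baseline; once empty, service drops to the baseline."""
    baseline = 3 * volume_gib
    credits = bucket_capacity            # assume a full bucket at t = 0
    served = []
    for _ in range(seconds):
        credits = min(bucket_capacity, credits + baseline)  # refill
        want = min(demand_iops, burst_max)
        grant = want if credits >= want else baseline       # empty -> baseline
        credits -= grant
        served.append(grant)
    return served


# 100 GiB volume (300 IOPS baseline) under sustained 3,000 IOPS demand:
trace = simulate_gp2(100, 3_000, seconds=2_100)
trace[0]   # 3000 -- bursting on credit
trace[-1]  # 300  -- bucket drained, back to baseline
```

The trace shows why the burst model suits genuinely bursty workloads: sustained demand above baseline eventually exhausts the bucket, while intermittent demand lets it refill between bursts.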

Why it's a configurable cap, not a hardware limit

On network-attached block storage (EBS, GCP Persistent Disk, Azure Managed Disk), IOPS is administratively capped because the backing storage fleet is shared across many customers. A single noisy-neighbour issuing at the underlying SSD's hardware limit would starve every other volume on the same server. IOPS caps are the isolation mechanism the shared fleet uses to honour per-volume SLAs. See concepts/iops-throttle-network-storage for the structural framing; concepts/noisy-neighbor + concepts/performance-isolation for the underlying problem.

Direct-attached NVMe has no such cap — the hardware limit is the limit. See systems/nvme-ssd + systems/planetscale-metal + patterns/direct-attached-nvme-with-replication.

Seen in

  • sources/2026-04-21-planetscale-increase-iops-and-throughput-with-sharding — Ben Dicken canonicalises the gp3 single-IOP semantics (one 64 KiB disk read or write), the naive-formula upper bound, the random-I/O penalty, the independence of IOPS and throughput capacity, and IOPS as a first-class database-sizing parameter alongside vCPU / RAM / storage. Small-DB baseline assumes 3,000 default IOPS suffices; 8×-scale target (4 TB / 24,000 IOPS / 1,000 MiB/s per primary) triggers the upgrade to io1 / io2 provisioned-IOPS tier at 11-13× cost, motivating patterns/sharding-as-iops-scaling as the cost-mitigation.
  • sources/2025-03-13-planetscale-io-devices-and-latency — Ben Dicken's earlier pedagogical post on storage-device physics + the gp3 default-IOPS-cap + gp2 burst-bucket framing. Teaching numbers for local-NVMe vs EBS latency (~50 μs vs ~250 μs) complement this post's cost numbers.