

Throughput vs IOPS

Definition

Throughput and IOPS are the two independent capacity dimensions of any storage system, and cloud providers provision and price them separately.

  • IOPS = number of operations per second (operations-count dimension).
  • Throughput = bytes moved per second (bandwidth dimension).

The naive intuition "if I know IOPS and the block size, I know throughput" is correct only in the sequential I/O regime. In practice, both dimensions have independent caps and independent pricing, and a workload can hit the throughput ceiling before the IOPS ceiling (or vice versa).
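The interaction between the two caps can be sketched numerically. This is an illustrative model, not any provider's metering logic; the cap figures are the default gp3 numbers cited later in this note.

```python
# Sketch: a volume's delivered rate is clipped by BOTH caps independently.
KIB = 1024
MIB = 1024 * KIB

def effective_throughput(iops_demand: float, block_size: int,
                         iops_cap: float, throughput_cap: float) -> float:
    """Bytes/s actually delivered for a sequential workload."""
    allowed_iops = min(iops_demand, iops_cap)   # ops-count ceiling
    wanted_bytes = allowed_iops * block_size    # the naive sequential arithmetic
    return min(wanted_bytes, throughput_cap)    # bandwidth ceiling clips it

# Same 3,000 IOPS demand against default gp3 caps, two block sizes:
small = effective_throughput(3_000, 8 * KIB, 3_000, 125 * MIB)   # IOPS-bound
large = effective_throughput(3_000, 64 * KIB, 3_000, 125 * MIB)  # throughput-bound
print(small / MIB, large / MIB)  # 23.4375 125.0
```

With small blocks the IOPS cap binds well before the bandwidth cap; with 64 KiB blocks the arithmetic product exceeds 125 MiB/s and the throughput cap clips it.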

Dicken's framing

"Throughput is the total amount of data or requests that can move through a system over a defined span of time. Though throughput is related to IOPS, a volume's IOPS does not directly translate to a volume's throughput delivery at any given time."

(Source: sources/2026-04-21-planetscale-increase-iops-and-throughput-with-sharding)

The gp3 worked example

Default gp3 EBS volume limits:

Dimension    Default     Max (configurable)
IOPS         3,000       16,000
Throughput   125 MiB/s   1,000 MiB/s

With perfectly sequential I/O at 64 KiB per IOP: 3,000 × 64 KiB = 192 MiB/s. But the 125 MiB/s default throughput cap sits below that theoretical ceiling. Dicken verbatim:

"based on the calculation from earlier, if we utilize our 3000 IOPS with perfectly sequential IO patterns, we could theoretically achieve 192 MiB/s. However, due to the default throughput limit of 125 MiB/s, we cannot actually reach this unless we purchase more throughput."

So on a default gp3, the throughput dimension is the binding constraint for sequential-heavy workloads; the IOPS dimension is the binding constraint for small-random-I/O-heavy workloads.
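Which dimension binds flips at a crossover block size that follows directly from the two caps. A quick check (illustrative Python; figures are the default gp3 caps from the table above):

```python
KIB = 1024
MIB = 1024 * KIB

iops_cap = 3_000          # default gp3 IOPS
throughput_cap = 125 * MIB  # default gp3 throughput

# Block size at which the binding constraint flips:
# below it, the workload exhausts IOPS first; above it, throughput.
crossover = throughput_cap / iops_cap
print(crossover / KIB)  # ≈ 42.7 KiB
```

At 64 KiB per IOP (above the crossover) throughput binds, matching Dicken's 192-vs-125 example; at typical database page sizes (8 to 16 KiB, below it) IOPS binds.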

Why cloud providers cap them separately

Network-attached block storage runs on a shared fleet of storage servers connected over a shared network fabric. The two caps protect two different shared resources:

  • IOPS cap protects the storage-server CPU and SSD command queue — each operation is metadata + media access.
  • Throughput cap protects the network bandwidth between the EC2 instance and the storage server — each byte traverses shared wire.

A workload that issued at hardware-limit IOPS with large block sizes would saturate the network; a workload that issued at hardware-limit throughput with tiny block sizes would saturate the storage fleet's CPU. Independent caps let the provider provision each resource separately.

Which dimension a database workload binds on

OLTP databases typically bind on IOPS because writes are small-block (page-level); analytics and full-table-scan workloads typically bind on throughput because they stream large contiguous blocks.

Under the default gp3 configuration:

  • A database fsyncing 1,500 commits/second, each commit issuing 2 WAL writes of ~24 KiB: 3,000 IOPS × 24 KiB = 72 MiB/s: under the 125 MiB/s throughput cap, but at the 3,000 IOPS cap.
  • A full-table-scan streaming 125 MiB/s at 64 KiB per IOP = 2,000 IOPS: under the 3,000 IOPS cap, but at the 125 MiB/s throughput cap.

Both workloads need to upgrade to higher ceilings — but for different reasons. The OLTP workload pays for provisioned IOPS; the analytics workload pays for provisioned throughput.
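The two examples above reduce to comparing how much of each cap a workload consumes. A sketch under the default gp3 caps (the workload figures are the hypothetical ones from the bullets, not measured values):

```python
KIB = 1024
MIB = 1024 * KIB
IOPS_CAP, TPUT_CAP = 3_000, 125 * MIB  # default gp3

def binding_dimension(iops: float, block_size: int) -> str:
    """Which cap the workload consumes the larger fraction of."""
    iops_frac = iops / IOPS_CAP
    tput_frac = (iops * block_size) / TPUT_CAP
    return "IOPS" if iops_frac >= tput_frac else "throughput"

print(binding_dimension(3_000, 24 * KIB))  # OLTP commits -> IOPS
print(binding_dimension(2_000, 64 * KIB))  # table scan   -> throughput
```

The OLTP workload sits at 100% of the IOPS cap but only ~56% of the throughput cap; the scan is the mirror image, so each pays for a different upgrade.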

Typical workload regimes

Workload                                 IOPS-bound   Throughput-bound
OLTP (point lookups, small commits)      ✔
Backup / restore (large sequential)                   ✔
Analytics table scan                                  ✔
Index-heavy read traffic (random I/O)    ✔
WAL shipping / replication                            ✔ (depends on throughput)
Log ingestion (append-heavy)                          ✔

Architectural implications

  • Provision both dimensions. Sizing a database volume to IOPS alone ("we need 10k IOPS") without sizing throughput (at 64 KiB per IOP, 10k IOPS = 640 MiB/s — above the 125 MiB/s default) produces a surprise cap in production.
  • Horizontal sharding divides both. With N shards, each shard carries ~1/N of the IOPS demand and ~1/N of the throughput demand. Provisioned-IOPS and provisioned-throughput premium tiers become avoidable. See patterns/sharding-as-iops-scaling.
  • Direct-attached NVMe bypasses both caps. Local NVMe is bound only by the drive's hardware limits, which are orders of magnitude higher on both dimensions. See patterns/direct-attached-nvme-with-replication + systems/planetscale-metal.
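The sharding arithmetic from the second bullet, sketched with illustrative numbers (assumes keys distribute evenly; a hot shard breaks the even split):

```python
MIB = 1024 * 1024

def per_shard_demand(total_iops: float, total_tput_bytes: float,
                     shards: int) -> tuple[float, float]:
    """Per-shard demand under an even key distribution."""
    return total_iops / shards, total_tput_bytes / shards

# A fleet-level demand of 12,000 IOPS and 500 MiB/s, split four ways,
# lands each shard at the default gp3 ceilings (3,000 IOPS, 125 MiB/s).
iops, tput = per_shard_demand(12_000, 500 * MIB, 4)
print(iops, tput / MIB)  # 3000.0 125.0
```

This is why sharding substitutes for buying provisioned IOPS or throughput: the per-volume demand drops back under the default caps.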

Seen in

  • sources/2026-04-21-planetscale-increase-iops-and-throughput-with-sharding — Ben Dicken's canonical framing: IOPS and throughput as separate capacity dimensions with independent gp3 defaults (3k IOPS + 125 MiB/s) and independent max ceilings (16k + 1,000 MiB/s on gp3; 256k + 4,000 MiB/s on io2). The 192 MiB/s-theoretical-but-capped-at-125 MiB/s worked example canonicalises that the two dimensions are not derivable from each other via block-size arithmetic.