CONCEPT

Disk throughput bottleneck

Definition

Disk throughput bottleneck — the regime where a system's performance ceiling is set by how many bytes per second the underlying storage device can read or write — not CPU, not network, not memory. Manifests as saturated IOPS or MB/s at the device level, growing queue depth at the OS level, and a flat response-time floor that doesn't budge when you add client concurrency.

Distinct from:

  • Working-set-vs-cache bottleneck (see concepts/wiredtiger-cache) — symptom is a rising cache-miss rate with unchanged disk throughput, but disk latency matters more because it sits on the query path. Fix is shrinking the working set or adding RAM.
  • CPU bottleneck — all cores pinned, disk idle. Fix is algorithmic or horizontal scaling.
  • Network-bandwidth / round-trip bottleneck — see concepts/network-round-trip-cost. Fix is collapse-round-trips patterns (bulk writes, pipelining).

The primary diagnostic signal is disk utilization (%util in iostat): sustained at or near 100 % is the hallmark.
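This signal can be flagged programmatically. A minimal sketch that parses iostat -x output — the column layout varies across sysstat versions, so the %util column is located from the header row rather than hard-coded:

```python
# Sketch: flag devices pinned at (near) 100 %util in `iostat -x` output.
# Column positions differ by sysstat version, so find %util via the header.

def saturated_devices(iostat_output: str, threshold: float = 95.0) -> list[str]:
    """Return device names whose %util meets or exceeds the threshold."""
    util_col = None
    hits = []
    for line in iostat_output.splitlines():
        cols = line.split()
        if not cols:
            continue
        if cols[0] == "Device":           # header row: locate the %util column
            util_col = cols.index("%util")
        elif util_col is not None and len(cols) > util_col:
            try:
                if float(cols[util_col]) >= threshold:
                    hits.append(cols[0])
            except ValueError:
                pass                      # skip non-numeric rows
    return hits

sample = """\
Device            r/s     w/s     rkB/s     wkB/s  %util
nvme0n1         120.0  8400.0    4800.0  512000.0  99.80
sda               0.5     1.2      20.0      48.0   3.10
"""
print(saturated_devices(sample))  # nvme0n1 is saturated; sda is not
```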

Standard levers (in order of usual preference)

  1. Shrink what lands on disk. Tighter schema, better compression, fewer secondary indexes, shorter field names, value-type tightening (string → integer, integer → enum, ISO string → Date). MongoDB's "Cost of Not Knowing" series Parts 1 and 2 walk through exactly this sequence.
  2. Batch writes to amortize fsync / journal cost. Bulk writes collapse N disk-queue entries to one; MongoDB's bulkWrite is the canonical primitive (see patterns/bulk-write-batch-optimization).
  3. Shift to a lower-cost-per-byte compression. zstd for strongly compressible data; trades CPU for disk bytes (see concepts/document-storage-compression). Only works if there's free CPU.
  4. Harder: move hot data to faster storage. NVMe vs SATA SSD is an order of magnitude; local NVMe vs EBS is another. Provisioned IOPS tiers on cloud block storage.
  5. Last resort: shard horizontally. Spread the throughput requirement across multiple servers' disks. Expensive — changes the system topology.
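Lever 2 generalizes beyond MongoDB: batching works because the fixed per-flush cost (fsync, journal commit, round trip) is paid once per batch instead of once per write. A hypothetical accumulator — not the bulkWrite API itself — makes the amortization concrete:

```python
# Sketch: amortize a fixed per-flush cost by batching writes.
# `flush` is a stand-in for whatever pays the fixed cost (fsync, bulkWrite).

class WriteBatcher:
    def __init__(self, flush, batch_size: int = 1000):
        self.flush = flush            # callable that takes a list of documents
        self.batch_size = batch_size
        self.buffer = []
        self.flush_count = 0          # how many times the fixed cost was paid

    def write(self, doc):
        self.buffer.append(doc)
        if len(self.buffer) >= self.batch_size:
            self._flush()

    def _flush(self):
        if self.buffer:
            self.flush(self.buffer)
            self.flush_count += 1
            self.buffer = []

    def close(self):
        self._flush()                 # drain the partial final batch

sink = []
batcher = WriteBatcher(flush=sink.extend, batch_size=1000)
for i in range(2500):
    batcher.write({"n": i})
batcher.close()
print(batcher.flush_count, len(sink))   # 3 flushes instead of 2500
```

The same 2,500 documents land on disk either way; what changes is that the fixed cost is paid 3 times rather than 2,500.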

Bottleneck migration

The disk-throughput regime is often transient. Shrinking documents or narrowing the schema can push the bottleneck off disk entirely, revealing the next-slowest resource:

  • Document-size shrink → often hits index-in-cache next.
  • Bulk-write collapse → often hits CPU on aggregation / query parsing, or client-server round-trip if the client is far.
  • Compression upgrade → hits CPU decompression overhead on the read path.
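The compression trade in the last bullet is directly measurable. A sketch using the standard library's zlib as a stand-in for zstd (which is a third-party dependency): higher levels spend more CPU to remove more disk bytes, and every uncached read pays the decompression cost back:

```python
import time
import zlib

# Sketch: measure the CPU-for-bytes trade across compression levels.
# zlib stands in for zstd here; the shape of the trade-off is the same.

payload = b'{"status": "active", "region": "us-east-1"}' * 5000

for level in (1, 6, 9):
    t0 = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - t0
    ratio = len(payload) / len(compressed)
    print(f"level {level}: {ratio:5.1f}x smaller, {elapsed * 1e3:.2f} ms to compress")

# The read path pays the inverse cost: every block fetched from disk
# must be inflated before the query engine can touch it.
assert zlib.decompress(zlib.compress(payload, 9)) == payload
```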

The MongoDB Cost of Not Knowing Part 3 case study is the canonical wiki instance: appV5R4's bottleneck was disk throughput → appV6R0 shrank documents by 67.5 % and the bottleneck moved to the index outgrowing cache → appV6R1 traded fewer, slightly larger documents for a smaller index, which fit in cache. Each iteration dislodged the previous dominant cost. This is the pattern captured in patterns/schema-iteration-via-load-testing.

Caveats

  • The "right" bottleneck depends on the hardware envelope. A 4 GB-RAM dev machine puts the index-in-cache boundary at 1.5 GB; a 256 GB production machine puts it at ~127 GB. Tuning on the smaller envelope surfaces trade-offs that are invisible on the larger one — useful for learning, dangerous for copy-pasting conclusions to production without re-testing.
  • Disk throughput has asymmetric directions. Reads (random vs sequential, cached vs uncached) and writes (journaled, acknowledged, fsync'd) have very different per-IO costs. A database may be read-throughput-bottlenecked but write-cache-bottlenecked — the single "disk throughput" label hides this.
  • Cloud storage adds queueing. Provisioned-IOPS EBS volumes impose an IOPS cap separate from the raw-device ceiling; exceeding it throttles rather than saturates, which looks like constant latency instead of growing queue depth.
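The cache-boundary numbers in the first caveat follow from WiredTiger's default cache sizing, max(50 % of (RAM − 1 GB), 256 MB). A quick check of both envelopes:

```python
# Sketch: WiredTiger's default cache size, max(0.5 * (RAM - 1 GB), 256 MB),
# which sets the index-in-cache boundary discussed above.

def wiredtiger_default_cache_gb(ram_gb: float) -> float:
    return max(0.5 * (ram_gb - 1), 0.25)

print(wiredtiger_default_cache_gb(4))    # 1.5   — the 4 GB dev machine
print(wiredtiger_default_cache_gb(256))  # 127.5 — the 256 GB production box
```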
