

Network-attached storage latency penalty

Definition

Network-attached storage latency penalty is the extra round-trip time a storage IO pays when the backing drive sits across a network hop from the compute instance, instead of on the same physical host. The underlying SSD is often the same part in both deployments; the penalty is entirely network round-trip + protocol stack + remote queueing.

Canonical numbers from Dicken (2025):

Path                                  Round-trip
CPU ↔ locally-attached NVMe SSD       ~50 μs
CPU ↔ network-attached SSD (EBS)      ~250 μs

"Using the same cutting-edge SSD now takes an order of magnitude longer to fulfill individual read and write requests." (Source: sources/2025-03-13-planetscale-io-devices-and-latency)
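The per-IO gap is directly measurable. A rough probe (not Dicken's methodology — just a minimal sketch that times synchronous 4 KiB write+fsync round trips to whatever device backs the filesystem) looks like this:

```python
import os
import statistics
import time

def probe_write_latency(path="latency_probe.bin", iters=200, size=4096):
    """Median microseconds per synchronous 4 KiB write+fsync round trip."""
    buf = os.urandom(size)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    samples = []
    try:
        for _ in range(iters):
            t0 = time.perf_counter()
            os.pwrite(fd, buf, 0)  # rewrite the same block each iteration
            os.fsync(fd)           # force the write through to the device
            samples.append((time.perf_counter() - t0) * 1e6)
    finally:
        os.close(fd)
        os.unlink(path)
    return statistics.median(samples)

print(f"median write+fsync: {probe_write_latency():.0f} μs")
```

Run on a local-NVMe instance versus an EBS-backed one, the medians should land in roughly the two bands above, though fsync also pays filesystem and drive-cache overheads, so absolute numbers will differ from raw device round trips.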

Why the cloud paid the penalty

The trade is explicit:

  1. Instance-independent durability. Traditional servers lost their data with the server. Network-attached volumes survive an instance termination — a workload can re-mount onto a fresh VM.
  2. Elastic capacity. The volume can be resized without migrating data. Local NVMe is fixed-size at provisioning.
  3. Abstracted failure recovery. Hardware failure on the block-storage side is hidden; customer code sees only the volume API.

For stateless app servers, these properties outweigh the 5× latency hit because apps do most work in-memory. For OLTP databases, the trade flips: every transaction commit is an IO, so a 5× slower commit is a 5× slower user-visible tail latency.
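The translation from per-IO latency to commit throughput is simple arithmetic. Under the illustrative assumption that each commit pays one synchronous storage round trip and commits on a connection are fully serialized behind that IO:

```python
# Illustrative assumption: one synchronous storage round trip per commit,
# commits on a single connection serialized behind that IO.
LOCAL_NVME_US = 50    # ~50 μs round trip, locally-attached NVMe (Dicken 2025)
NETWORK_SSD_US = 250  # ~250 μs round trip, network-attached SSD (EBS)

def max_commits_per_sec(io_latency_us: float) -> float:
    """Upper bound on serialized commits/s given per-commit IO latency."""
    return 1_000_000 / io_latency_us

print(max_commits_per_sec(LOCAL_NVME_US))   # 20000.0 commits/s ceiling
print(max_commits_per_sec(NETWORK_SSD_US))  # 4000.0 commits/s ceiling
```

Real databases batch commits via group commit and run many connections, so aggregate throughput exceeds this ceiling, but the per-transaction latency floor — what a single user waits for — still moves by the full 5×.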

Who pays the penalty

Industry defaults circa 2025:

  • Amazon RDS — EBS by default.
  • Amazon Aurora — custom distributed storage over network. Different trade than EBS but same network-round-trip topology.
  • Google Cloud SQL — network-attached.
  • PlanetScale (pre-Metal) — network-attached.

Direct-attached-NVMe databases exist but are the minority of managed-DB offerings. The pre-cloud default was direct attachment (SATA/SAS); managed services abandoned that for the elasticity above.

How AWS closed the gap (but didn't eliminate it)

The EBS team has spent over a decade attacking every queue in the path (systems/aws-ebs):

  • Nitro offload — moves EBS + VPC processing off the hypervisor CPU (systems/nitro).
  • SRD — out-of-order, multi-path storage transport replaces TCP (systems/srd).
  • Custom SSDs — Nitro SSDs tuned for the EBS workload (systems/aws-nitro-ssd).
  • io2 Block Express — sub-ms consistent latency on top-tier volumes.

The io2 figures approach but do not match direct-attached NVMe, and io2 is the premium tier — most customers on gp2/gp3 are still in the ~250 μs band Dicken's numbers cite.

Why it's the thesis for Metal

systems/planetscale-metal's pitch is that the two properties that drove the cloud to network storage (durability + elasticity) can be supplied without the network hop by:

  • Replication for durability (concepts/storage-replication-for-durability): 3× replicas with automated failover + frequent backups closes the "server dies → data dies" gap.
  • Scheduled resize via fresh instances — new node with larger drive, migrate data, retire old node. Not instant like EBS but does close the capacity-growth gap.
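The first bullet can be sketched in a few lines. This is a hypothetical illustration of durability via replica acknowledgment rather than a network-attached volume — the names and quorum policy are illustrative, not PlanetScale's actual implementation:

```python
# Hypothetical sketch: a commit is durable once a majority of replicas
# acknowledge a durable write. Replicas are modeled as callables that
# return True on a successful durable write; in reality each would be a
# network call to a node with its own direct-attached NVMe.
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Sequence

def replicate_commit(record: bytes,
                     replicas: Sequence[Callable[[bytes], bool]],
                     quorum: int = 2) -> bool:
    """Return True once `quorum` replicas have durably written `record`."""
    acks = 0
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(r, record) for r in replicas]
        for fut in as_completed(futures):
            if fut.result():
                acks += 1
                if acks >= quorum:
                    return True  # majority durable: commit survives any one node
    return False

# Three healthy replicas: commit acknowledged after the second durable write.
healthy = [lambda rec: True] * 3
print(replicate_commit(b"txn-payload", healthy))  # True
```

The point of the sketch: once a majority of independent nodes hold the write, losing any single instance (and its local drive) no longer loses data, which is the property network-attached volumes were bought for.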

The result: direct-attached-NVMe latency (50 μs) with cloud-database-class durability and elasticity. See patterns/direct-attached-nvme-with-replication.

Seen in

  • sources/2025-03-13-planetscale-io-devices-and-latency — canonical 50 μs vs 250 μs framing and the argument for Metal.
  • sources/2024-08-22-allthingsdistributed-continuous-reinvention-block-storage-at-aws — AWS-side engineering story of shrinking the same gap.
  • sources/2025-10-14-planetscale-benchmarking-postgres-17-vs-18 — benchmark-measured empirical backing for the penalty. Dicken's Postgres 17 vs 18 sweep shows the local-NVMe i7i.2xlarge winning every tested configuration against three EBS variants on r7i.2xlarge. Postgres 18's new async-I/O modes (worker, io_uring) do not close the gap: the 5× latency penalty is load-bearing even on the most aggressive storage tested (io2-16k), and the async-I/O machinery benefits less than expected on network-attached storage because the per-I/O latency floor dominates the scheduling-overhead reduction. Canonical wiki datum that the penalty is not a knob-twisting problem — it's a physics problem that async I/O can't engineer away.