CONCEPT Cited by 3 sources
# Network-attached storage latency penalty
## Definition
Network-attached storage latency penalty is the extra round-trip time a storage IO pays when the backing drive sits across a network hop from the compute instance, instead of on the same physical host. The underlying SSD is often the same part in both deployments; the penalty is entirely network round-trip + protocol stack + remote queueing.
Canonical numbers from Dicken (2025):
| Path | Round-trip |
|---|---|
| CPU ↔ locally-attached NVMe SSD | ~50 μs |
| CPU ↔ network-attached SSD (EBS) | ~250 μs |
"Using the same cutting-edge SSD now takes an order of magnitude longer to fulfill individual read and write requests." (Source: sources/2025-03-13-planetscale-io-devices-and-latency)
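The table above reduces to a single multiplier. A quick sketch of the arithmetic, using Dicken's round-trip figures:

```python
# Round-trip figures from Dicken (2025), in microseconds.
LOCAL_NVME_US = 50    # CPU <-> locally-attached NVMe SSD
NETWORK_EBS_US = 250  # CPU <-> network-attached SSD (EBS)

# The drive is the same part in both deployments, so the whole gap is
# network round-trip + protocol stack + remote queueing.
penalty = NETWORK_EBS_US / LOCAL_NVME_US      # 5.0x per I/O
overhead_us = NETWORK_EBS_US - LOCAL_NVME_US  # 200 us of pure path overhead

print(f"penalty: {penalty:.0f}x, overhead: {overhead_us} us per request")
```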
## Why the cloud paid the penalty
The trade is explicit:
- Instance-independent durability. Traditional servers lost their data with the server. Network-attached volumes survive an instance termination — a workload can re-mount onto a fresh VM.
- Elastic capacity. The volume can be resized without migrating data. Local NVMe is fixed-size at provisioning.
- Abstracted failure recovery. Hardware failure on the block-storage side is hidden; customer code sees only the volume API.
For stateless app servers, these properties outweigh the 5× latency hit because apps do most work in-memory. For OLTP databases, the trade flips: every transaction commit is an IO, so a 5× slower commit shows up directly as 5× higher user-visible tail latency.
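Why the trade flips for OLTP can be made concrete with a rough sketch of the serial-commit ceiling: one session that waits on a single synchronous I/O per commit can never commit faster than the I/O latency allows (a deliberate simplification that ignores group commit, CPU time, and concurrent sessions).

```python
def max_serial_commits_per_sec(io_latency_us: float) -> float:
    """Upper bound on commits/sec for one session that performs
    one synchronous durable write per commit (simplified model)."""
    return 1_000_000 / io_latency_us

local_ceiling = max_serial_commits_per_sec(50)    # local NVMe: 20,000/sec
remote_ceiling = max_serial_commits_per_sec(250)  # network-attached: 4,000/sec
```

The 5× per-I/O penalty becomes a 5× lower commit ceiling, which is why the hit is invisible to an in-memory app server but load-bearing for a database.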
## Who pays the penalty
Industry defaults circa 2025:
- Amazon RDS — EBS by default.
- Amazon Aurora — custom distributed storage over network. Different trade than EBS but same network-round-trip topology.
- Google Cloud SQL — network-attached.
- PlanetScale (pre-Metal) — network-attached.
Direct-attached-NVMe databases exist but are the minority of managed-DB offerings. The pre-cloud default was direct attachment (SATA/SAS); managed services abandoned that for the elasticity above.
## How AWS closed the gap (but didn't eliminate it)
The EBS team has spent over a decade attacking every queue in the path (systems/aws-ebs):
- Nitro offload — moves EBS + VPC processing off the hypervisor CPU (systems/nitro).
- SRD — out-of-order, multi-path storage transport replaces TCP (systems/srd).
- Custom SSDs — Nitro SSDs tuned for the EBS workload (systems/aws-nitro-ssd).
- io2 Block Express — sub-ms consistent latency on top-tier volumes.
The io2 figures approach but do not match direct-attached NVMe, and io2 is the premium tier — most customers on GP2/GP3 are still in the 250 μs band Dicken's numbers cite.
## Why it's the thesis for Metal
systems/planetscale-metal's pitch is that the two properties that drove the cloud to network storage (durability + elasticity) can be supplied without the network hop by:
- Replication for durability — concepts/storage-replication-for-durability: 3× replicas with automated failover + frequent backups closes the "server dies → data dies" gap.
- Scheduled resize via fresh instances — new node with larger drive, migrate data, retire old node. Not instant like EBS but does close the capacity-growth gap.
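A back-of-envelope sketch of the durability side of this substitution. The per-node failure probability and the independence assumption are illustrative choices, not figures from the source:

```python
# Assumed probability that a single node is lost within one repair window
# (illustrative number; the source quotes no per-node failure rate).
p_node_loss = 0.001

# With 3x replication and automated failover, data is lost only if all
# three replicas fail before any can be replaced -- assuming independent
# failures, which real deployments approximate via placement across
# failure domains.
p_data_loss = p_node_loss ** 3
```

Under these assumptions the per-window loss probability drops from 10⁻³ to 10⁻⁹, which is the sense in which replication closes the "server dies → data dies" gap without a network hop on the IO path.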
The result: direct-attached-NVMe latency (50 μs) with cloud-database-class durability and elasticity. See patterns/direct-attached-nvme-with-replication.
## Seen in
- sources/2025-03-13-planetscale-io-devices-and-latency — canonical 50 μs vs 250 μs framing and the argument for Metal.
- sources/2024-08-22-allthingsdistributed-continuous-reinvention-block-storage-at-aws — AWS-side engineering story of shrinking the same gap.
- sources/2025-10-14-planetscale-benchmarking-postgres-17-vs-18 — benchmark-measured empirical backing for the penalty. Dicken's Postgres 17 vs 18 sweep shows the local-NVMe `i7i.2xlarge` winning every tested configuration against three EBS variants on `r7i.2xlarge`. Postgres 18's new async-I/O modes (`worker`, `io_uring`) do not close the gap. The 5× latency penalty is load-bearing even on the most aggressive storage (io2-16k) — and the async-I/O machinery benefits less than expected on network-attached storage because the per-I/O latency floor dominates the scheduling-overhead reduction. Canonical wiki datum that the penalty is not a knob-twisting problem — it's a physics problem that async I/O can't engineer away.