

Network-attached storage latency penalty

Definition

Network-attached storage latency penalty is the extra round-trip time a storage IO pays when the backing drive sits across a network hop from the compute instance, instead of on the same physical host. The underlying SSD is often the same part in both deployments; the penalty is entirely network round-trip + protocol stack + remote queueing.

Canonical numbers from Dicken (2025):

Path                                  Round-trip
CPU ↔ locally-attached NVMe SSD       ~50 μs
CPU ↔ network-attached SSD (EBS)      ~250 μs

"Using the same cutting-edge SSD now takes an order of magnitude longer to fulfill individual read and write requests." (Source: sources/2025-03-13-planetscale-io-devices-and-latency)
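The per-IO gap is directly measurable. A rough probe (not Dicken's methodology — just a minimal sketch that times synchronous 4 KiB write+fsync round trips to whatever device backs the filesystem) looks like this:

```python
import os
import statistics
import time

def probe_write_latency(path="latency_probe.bin", iters=200, size=4096):
    """Median microseconds per synchronous 4 KiB write+fsync round trip."""
    buf = os.urandom(size)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    samples = []
    try:
        for _ in range(iters):
            t0 = time.perf_counter()
            os.pwrite(fd, buf, 0)  # rewrite the same block each iteration
            os.fsync(fd)           # force the write through to the device
            samples.append((time.perf_counter() - t0) * 1e6)
    finally:
        os.close(fd)
        os.unlink(path)
    return statistics.median(samples)

print(f"median write+fsync: {probe_write_latency():.0f} μs")
```

Run on a local-NVMe instance versus an EBS-backed one, the medians should land in roughly the two bands above, though fsync also pays filesystem and drive-cache overheads, so absolute numbers will differ from raw device round trips.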

Why the cloud paid the penalty

The trade is explicit:

  1. Instance-independent durability. Traditional servers lost their data with the server. Network-attached volumes survive an instance termination — a workload can re-mount onto a fresh VM.
  2. Elastic capacity. The volume can be resized without migrating data. Local NVMe is fixed-size at provisioning.
  3. Abstracted failure recovery. Hardware failure on the block-storage side is hidden; customer code sees only the volume API.

For stateless app servers, these properties outweigh the 5× latency hit because apps do most work in-memory. For OLTP databases, the trade flips: every transaction commit is an IO, so a 5× slower commit is a 5× slower user-visible tail latency.
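The translation from per-IO latency to commit throughput is simple arithmetic. Under the illustrative assumption that each commit pays one synchronous storage round trip and commits on a connection are fully serialized behind that IO:

```python
# Illustrative assumption: one synchronous storage round trip per commit,
# commits on a single connection serialized behind that IO.
LOCAL_NVME_US = 50    # ~50 μs round trip, locally-attached NVMe (Dicken 2025)
NETWORK_SSD_US = 250  # ~250 μs round trip, network-attached SSD (EBS)

def max_commits_per_sec(io_latency_us: float) -> float:
    """Upper bound on serialized commits/s given per-commit IO latency."""
    return 1_000_000 / io_latency_us

print(max_commits_per_sec(LOCAL_NVME_US))   # 20000.0 commits/s ceiling
print(max_commits_per_sec(NETWORK_SSD_US))  # 4000.0 commits/s ceiling
```

Real databases batch commits via group commit and run many connections, so aggregate throughput exceeds this ceiling, but the per-transaction latency floor — what a single user waits for — still moves by the full 5×.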

Who pays the penalty

Industry defaults circa 2025:

  • Amazon RDS — EBS by default.
  • Amazon Aurora — custom distributed storage over network. Different trade than EBS but same network-round-trip topology.
  • Google Cloud SQL — network-attached.
  • PlanetScale (pre-Metal) — network-attached.

Direct-attached-NVMe databases exist but are the minority of managed-DB offerings. The pre-cloud default was direct attachment (SATA/SAS); managed services abandoned that for the elasticity above.

How AWS closed the gap (but didn't eliminate it)

The EBS team has spent over a decade attacking every queue in the path (systems/aws-ebs):

  • Nitro offload — moves EBS + VPC processing off the hypervisor CPU (systems/nitro).
  • SRD — out-of-order, multi-path storage transport replaces TCP (systems/srd).
  • Custom SSDs — Nitro SSDs tuned for the EBS workload (systems/aws-nitro-ssd).
  • io2 Block Express — sub-ms consistent latency on top-tier volumes.

The io2 figures approach but do not match direct-attached NVMe, and io2 is the premium tier — most customers on gp2/gp3 are still in the ~250 μs band Dicken's numbers cite.

Why it's the thesis for Metal

systems/planetscale-metal's pitch is that the two properties that drove the cloud to network storage (durability + elasticity) can be supplied without the network hop by:

  • Replication for durability (concepts/storage-replication-for-durability): 3× replicas with automated failover + frequent backups closes the "server dies → data dies" gap.
  • Scheduled resize via fresh instances — new node with larger drive, migrate data, retire old node. Not instant like EBS but does close the capacity-growth gap.
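The first bullet can be sketched in a few lines. This is a hypothetical illustration of durability via replica acknowledgment rather than a network-attached volume — the names and quorum policy are illustrative, not PlanetScale's actual implementation:

```python
# Hypothetical sketch: a commit is durable once a majority of replicas
# acknowledge a durable write. Replicas are modeled as callables that
# return True on a successful durable write; in reality each would be a
# network call to a node with its own direct-attached NVMe.
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Sequence

def replicate_commit(record: bytes,
                     replicas: Sequence[Callable[[bytes], bool]],
                     quorum: int = 2) -> bool:
    """Return True once `quorum` replicas have durably written `record`."""
    acks = 0
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(r, record) for r in replicas]
        for fut in as_completed(futures):
            if fut.result():
                acks += 1
                if acks >= quorum:
                    return True  # majority durable: commit survives any one node
    return False

# Three healthy replicas: commit acknowledged after the second durable write.
healthy = [lambda rec: True] * 3
print(replicate_commit(b"txn-payload", healthy))  # True
```

The point of the sketch: once a majority of independent nodes hold the write, losing any single instance (and its local drive) no longer loses data, which is the property network-attached volumes were bought for.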

The result: direct-attached-NVMe latency (50 μs) with cloud-database-class durability and elasticity. See patterns/direct-attached-nvme-with-replication.

Seen in

  • sources/2025-03-13-planetscale-io-devices-and-latency — canonical 50 μs vs 250 μs framing and the argument for Metal.
  • sources/2024-08-22-allthingsdistributed-continuous-reinvention-block-storage-at-aws — AWS-side engineering story of shrinking the same gap.
  • sources/2025-10-14-planetscale-benchmarking-postgres-17-vs-18 — benchmark-measured empirical backing for the penalty. Dicken's Postgres 17 vs 18 sweep shows the local-NVMe i7i.2xlarge winning every tested configuration against three EBS variants on r7i.2xlarge. Postgres 18's new async-I/O modes (worker, io_uring) do not close the gap: the 5× latency penalty is load-bearing even on the most aggressive storage tested (io2-16k), and the async-I/O machinery benefits less than expected on network-attached storage because the per-I/O latency floor dominates the scheduling-overhead reduction. Canonical wiki datum that the penalty is not a knob-twisting problem — it's a physics problem that async I/O can't engineer away.