CONCEPT Cited by 2 sources
Performance isolation
Definition
Performance isolation is the property that one tenant's workload does not observably affect another tenant's latency or throughput — distinct from simple fairness (everyone gets some share) and from load balancing (the total work is spread across resources). It is the strong form of the answer to concepts/noisy-neighbor: not "the worst case is rarer," but "your tail is not a function of my behavior."
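The difference between sharing and isolation can be made concrete with a toy queueing model. The following is a minimal sketch, not how EBS works: it assumes a single device with a fixed 1 ms service time, and it models isolation as static partitioning (each tenant gets its own queue at half the device rate). The names `fifo_latencies` and `victim_p99` are hypothetical, and so are all the numbers.

```python
def fifo_latencies(arrivals, service_ms):
    """Latency of each request through one FIFO server.

    arrivals: list of (arrival_time_ms, tenant), sorted by arrival time.
    """
    free_at = 0.0
    latencies = []
    for arrival, tenant in arrivals:
        start = max(arrival, free_at)      # wait if the server is busy
        free_at = start + service_ms
        latencies.append((tenant, free_at - arrival))
    return latencies


def victim_p99(victim, neighbor, isolated, service_ms=1.0):
    """p99 latency seen by the victim tenant (toy model, assumed numbers)."""
    if isolated:
        # Assumption: static partitioning. The victim has a private queue
        # running at half the device rate, so the neighbor never appears.
        arrivals = sorted((t, "victim") for t in victim)
        latencies = fifo_latencies(arrivals, 2 * service_ms)
    else:
        # Shared: both tenants feed one FIFO queue at the full device rate.
        arrivals = sorted(
            [(t, "victim") for t in victim] + [(t, "neighbor") for t in neighbor]
        )
        latencies = fifo_latencies(arrivals, service_ms)
    mine = sorted(ms for tenant, ms in latencies if tenant == "victim")
    return mine[min(len(mine) - 1, (99 * len(mine)) // 100)]


victim = [10.0 * i for i in range(100)]   # one request every 10 ms
burst = [100.0] * 50                      # neighbor sends 50 requests at t=100 ms

for isolated in (False, True):
    quiet = victim_p99(victim, [], isolated)
    noisy = victim_p99(victim, burst, isolated)
    print(f"isolated={isolated}: p99 quiet={quiet} ms, noisy={noisy} ms")
```

In this model the shared queue's p99 for the victim rises from 1 ms to 51 ms when the neighbor bursts, while the partitioned queue's p99 stays at 2 ms in both cases. The partitioned tenant pays a higher baseline latency, which illustrates why isolation is a different goal from fairness or from maximizing average performance.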
Why it matters
From the EBS post:
As AWS evolved, we learned that we had to focus ruthlessly on a high-quality customer experience, and that inevitably meant that we needed to achieve strong performance isolation to avoid noisy neighbors causing interference with other customer workloads.
Performance isolation is what converts a multi-tenant storage service from "mostly fine on average" into something safe to host a customer's mission-critical application on. It is the property that Provisioned IOPS, launched in 2012, was created to sell.
How EBS achieves it (as of 2024)
Isolation is built up across layers of the stack, and each of the following contributes a piece:
- SSDs — collapse per-operation variance at the media level.
- systems/nitro offload — remove hypervisor queue depth coupling between tenants; stop stealing customer CPU for IO/encryption.
- Hardware-accelerated EBS encryption — at line rate; key material isolated from the hypervisor. No shared-CPU tax between tenants for crypto.
- systems/srd — multi-path, out-of-order, offload-friendly transport removes TCP's head-of-line variance.
- Instrumentation + canaries (patterns/full-stack-instrumentation) — so any regression of isolation is caught as it lands.
- Custom systems/aws-nitro-ssd — co-designed so media variance is known, not inherited from an off-the-shelf SSD controller's firmware choices.
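The canary item above implies some automated comparison of tail latency against a known-good baseline. The following is a minimal sketch of such a check, assuming latency samples in milliseconds are already being collected; the function names, the nearest-rank percentile, and the 1.25x tolerance are all hypothetical choices for illustration, not anything described in the EBS post.

```python
def percentile(samples, p):
    """Nearest-rank percentile of a non-empty list; p is an integer in 1..100."""
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100   # integer ceiling of p/100 * n
    return ordered[max(rank, 1) - 1]


def isolation_regressed(baseline_ms, canary_ms, p=99, tolerance=1.25):
    """True if the canary's tail latency exceeds the baseline's tail by more
    than the (assumed) tolerance factor."""
    return percentile(canary_ms, p) > tolerance * percentile(baseline_ms, p)


# Hypothetical samples: a healthy baseline and a canary with a fatter tail.
baseline = [1.0] * 99 + [2.0]
canary = [1.0] * 95 + [5.0] * 5

print(isolation_regressed(baseline, baseline))
print(isolation_regressed(baseline, canary))
```

Comparing tails rather than means matters here: a neighbor that inflates only the slowest few percent of requests can leave the average almost unchanged.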
Seen in
- sources/2024-08-22-allthingsdistributed-continuous-reinvention-block-storage-at-aws — Olson frames 15 years of EBS performance work as "ruthless focus on performance isolation."
- sources/2025-02-25-allthingsdistributed-building-and-operating-s3 — S3's isolation approach: patterns/data-placement-spreading puts a bucket's objects on disjoint drive sets so any one workload is a negligible fraction of any one drive's load; concepts/aggregate-demand-smoothing makes this tractable at S3 scale even though the same spread would widen the blast radius at EBS scale. A useful worked contrast between the two systems.
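The arithmetic behind spreading can be sketched in a few lines. This is an illustrative calculation only: it assumes a workload spreads perfectly evenly across its drives, and the function name `per_drive_share` and every number below are hypothetical rather than figures from the S3 post.

```python
def per_drive_share(workload_iops, drives, drive_iops):
    """Fraction of one drive's capacity consumed by a workload that is
    spread evenly across `drives` drives (idealized, assumed uniform)."""
    if drives <= 0 or drive_iops <= 0:
        raise ValueError("drives and drive_iops must be positive")
    return (workload_iops / drives) / drive_iops


# Hypothetical numbers: a 10,000 IOPS burst against drives rated at 200 IOPS.
concentrated = per_drive_share(10_000, drives=1, drive_iops=200)
spread = per_drive_share(10_000, drives=10_000, drive_iops=200)

print(f"on 1 drive: {concentrated:.0%} of the drive's capacity")
print(f"on 10,000 drives: {spread:.1%} of each drive's capacity")
```

Under these assumptions the same burst that would overload a single drive many times over becomes a fraction of a percent of each drive's load once it is spread widely, which is the sense in which any one workload becomes negligible to its neighbors.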