
CONCEPT Cited by 2 sources

Heat management (storage)

Definition

Heat in a multi-tenant storage system is the number of requests hitting a given drive per unit time. Heat management is the ongoing placement and steering problem of keeping request load distributed as evenly as possible across the fleet so that no single drive becomes a hotspot.

Warfield (S3, 2025):

By heat, I mean the number of requests that hit a given disk at any point in time. If we do a bad job of managing heat, then we end up focusing a disproportionate number of requests on a single drive, and we create hotspots because of the limited I/O that's available from that single disk. For us, this becomes an optimization challenge of figuring out how we can place data across our disks in a way that minimizes the number of hotspots.
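The definition can be made concrete with a minimal sketch: count requests per drive over a window and flag drives drawing a disproportionate share. Everything here is illustrative (the class, the threshold, the drive names) — it is not S3's mechanism.

```python
from collections import Counter

class HeatTracker:
    """Toy heat tracker: requests per drive this window; flags hotspots.

    Hypothetical sketch -- the hotspot rule ("more than hot_factor times
    the fleet mean") is an illustrative choice, not S3's actual policy.
    """
    def __init__(self, hot_factor=3.0):
        self.counts = Counter()       # drive_id -> requests this window
        self.hot_factor = hot_factor  # "hot" = this many times the mean

    def record(self, drive_id):
        self.counts[drive_id] += 1

    def hotspots(self):
        if not self.counts:
            return []
        mean = sum(self.counts.values()) / len(self.counts)
        return [d for d, n in self.counts.items()
                if n > self.hot_factor * mean]

tracker = HeatTracker()
for drive in ["d1"] * 50 + ["d2", "d3", "d4"] * 5:
    tracker.record(drive)
print(tracker.hotspots())  # d1 absorbs most requests and is flagged
```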

Why hotspots are dangerous even when drives don't fail

Hotspots don't take drives down; they queue requests. Queueing at a hot drive propagates upward:

  1. Dependent layers (metadata lookup, erasure-coding reconstruction — see concepts/erasure-coding) are waiting on this I/O.
  2. That wait is amplified through those layers (now their workers are busy).
  3. The stalled requests become stragglers in the tail of the overall latency distribution.
  4. At high enough fanout, some hotspot is always present, so the system's tail is set by hotspots (see concepts/tail-latency-at-scale).

Hotspots at individual hard disks create tail latency, and if you don't stay on top of them, they grow until they impact all request latency.
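The fanout point in step 4 follows from simple arithmetic (the hotspot probability and fanout values below are illustrative): if each drive is independently hot with probability p, a request touching n drives stalls whenever any one of them is hot.

```python
# Probability that a request with fanout n touches at least one hot drive,
# assuming each drive is independently hot with probability p_hot.
def p_request_hits_hotspot(p_hot, fanout):
    return 1 - (1 - p_hot) ** fanout

# Even a 1% per-drive hotspot rate dominates the tail at high fanout.
for fanout in (1, 10, 100):
    print(fanout, round(p_request_hits_hotspot(0.01, fanout), 3))
```

At fanout 100 a request is more likely than not to hit some hotspot — which is why "some hotspot is always present" and the system's tail is set by them.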

The structural problem

  • We must decide placement at write time, before we know what the access pattern will be.
  • Individual workloads are bursty — idle for long stretches, then a large peak.
  • At S3's scale, though, millions of workloads aggregate into a remarkably smooth demand curve (see concepts/aggregate-demand-smoothing).
  • So the placement problem is: translate smooth aggregate demand into smooth per-drive demand, via placement policy.
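A quick simulation shows the aggregation effect the bullets describe. The spike probability, demand values, and workload count are invented for illustration; the point is only that summing many independent bursty series yields a far smoother curve (lower coefficient of variation).

```python
import random

random.seed(0)

# One bursty workload: idle (demand 1) most steps, occasionally a big spike.
def bursty_demand(steps, spike_p=0.02, spike=100):
    return [spike if random.random() < spike_p else 1 for _ in range(steps)]

def cv(xs):
    """Coefficient of variation = stddev / mean (smoothness metric)."""
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / len(xs)
    return var ** 0.5 / m

steps = 500
one = bursty_demand(steps)
# Aggregate demand: sum 2000 independent workloads step by step.
aggregate = [sum(col) for col in
             zip(*(bursty_demand(steps) for _ in range(2000)))]
print(f"single workload cv={cv(one):.2f}, aggregate cv={cv(aggregate):.3f}")
```

Placement policy then has to preserve that smoothness on the way down: each drive should see a small, independent slice of the aggregate rather than a whole bursty workload.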

Levers used in S3

  1. Spread customer data across many drives. Different objects in the same bucket go on different disk sets. A customer's workload is served by millions of disks, any one of which sees a tiny fraction of that customer's IOPS. See patterns/data-placement-spreading.
  2. Redundancy as a steering mechanism. Replication and erasure coding give the frontend multiple drives to read any given shard from. Pick a non-hot one. See patterns/redundancy-for-heat.
  3. Erasure coding with k-of-(k+m). Splits an object into more pieces than are needed to read it: any k of the k+m shards suffice to reconstruct. Those extra shards provide scheduling flexibility. See concepts/erasure-coding.
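Levers 2 and 3 combine into redundancy-as-steering: with a k-of-(k+m) code the frontend can satisfy a read from any k shard drives, so it simply skips the hot ones. A minimal sketch, assuming a hypothetical per-drive heat score the frontend can consult (drive names, scores, and the 4-of-6 code are illustrative):

```python
def choose_read_set(shard_drives, heat, k):
    """Pick the k coolest drives holding shards of this object.

    With k-of-(k+m) erasure coding, ANY k shards reconstruct the object,
    so the read path is free to route around hot drives.
    """
    if len(shard_drives) < k:
        raise ValueError("not enough shards to reconstruct")
    return sorted(shard_drives, key=lambda d: heat[d])[:k]

# Illustrative heat scores (requests/sec); d1 and d4 are currently hot.
heat = {"d1": 900, "d2": 10, "d3": 15, "d4": 700, "d5": 12, "d6": 20}

# Object encoded 4-of-6: shards live on six drives, any four reconstruct it.
reads = choose_read_set(["d1", "d2", "d3", "d4", "d5", "d6"], heat, k=4)
print(reads)  # the four coolest shard holders; hot d1 and d4 are avoided
```

A greedy "coolest k" pick is the simplest possible policy; a real system would also weigh queue depth, locality, and staleness of the heat signal.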

Interplay with noisy neighbor

Heat management and concepts/noisy-neighbor are the same problem viewed from opposite ends: heat management asks "how do I spread load evenly?"; noisy-neighbor asks "how do I keep one workload from harming another?" Spread placement + redundancy-for-steering is S3's joint answer; compare EBS's approach in sources/2024-08-22-allthingsdistributed-continuous-reinvention-block-storage-at-aws.

Counterintuitive observation

Before joining Amazon, I spent time doing research and building systems that tried to predict and manage this I/O heat at much smaller scales — like local hard drives or enterprise storage arrays — and it was basically impossible to do a good job of. But this is a case where the sheer scale and the multitenancy of S3 result in a system that is fundamentally different.

At single-host scale the workload is the problem. At S3 scale the aggregation becomes the solution.

Seen in

  • sources/2025-02-25-allthingsdistributed-building-and-operating-s3 — Warfield defines heat, hotspots, stragglers; lays out spread-placement + redundancy-as-steering.
  • sources/2025-08-08-dropbox-seventh-generation-server-hardware — heat management at the mechanical / chassis level, complementary to S3's placement-level framing. Dropbox's systems/sonic storage chassis is co-designed to trade acoustic damping, airflow redirection, and fan-curve tuning against the vibration/thermal envelope that 30+ TB SMR drives introduce. The drive temperature sweet spot is ~40 °C: too hot and drives age faster and error rates rise, while slower fans reduce the vibration that can knock the nanometer-precision heads off-track — but you still need enough airflow to dissipate heat. Same "heat management" concept, a different layer of the stack from where S3 operates (drive-level mechanical rather than fleet-level placement).

Two layers of heat management

The two sources together cover heat management at two distinct layers:

| Layer | Who | Mechanism | Lever |
| --- | --- | --- | --- |
| Fleet placement (millions of drives) | AWS S3 | Spread customer data across many drives; use replication/EC as steering | patterns/data-placement-spreading, patterns/redundancy-for-heat |
| Chassis mechanical (per server / per rack) | Dropbox Sonic | Acoustic damping, airflow redirection, fan-curve tuning, co-designed drive chassis | concepts/hardware-software-codesign, patterns/supplier-codevelopment |

These are complementary, not alternatives. S3 could not spread well if the underlying drives were thermally unstable; Sonic's chassis work would be wasted if software placement overloaded individual drives. The full heat-management problem spans both layers, and operators at Dropbox's or S3's scale invest in both.
