
PATTERN

Middle-tier storage media

Context

Storage architectures at hyperscale settle into media tiers — HDD below, flash above. Over time each tier's economics shift: HDD areal density climbs while its bandwidth per terabyte (BW/TB) falls, because per-drive seek and transfer rates stay roughly flat as capacity grows; flash gets denser + cheaper but remains expensive relative to HDD.

Workloads that previously fit HDD's BW/TB band can stop fitting when HDD density crosses a threshold. They become stranded on HDD: the data technically fits on the drive, but the workload is starved of IOPS per byte. Promoting them to the flash tier costs more than the workload justifies; overprovisioning HDDs (buying extra drives purely for IOPS) defeats the capacity-per-dollar rationale of HDD in the first place.

A gap opens between the top of the HDD tier and the bottom of the flash tier.
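The stranding dynamic can be made concrete with a back-of-envelope sketch. The capacity points and the flat sustained bandwidth below are illustrative assumptions, not figures from the post:

```python
# Illustrative only: per-drive sustained bandwidth stays roughly flat
# while capacity grows, so BW/TB falls with each HDD generation.
hdd_generations_tb = [8, 16, 20, 30]  # hypothetical capacity points
sustained_mb_s = 200                  # assumed flat per-drive bandwidth

for cap_tb in hdd_generations_tb:
    bw_per_tb = sustained_mb_s / cap_tb
    print(f"{cap_tb:>3} TB HDD -> {bw_per_tb:5.1f} MB/s/TB")

# A workload needing ~15 MB/s/TB fits the 8 TB drive (25 MB/s/TB)
# but is stranded on the 30 TB drive (~6.7 MB/s/TB).
```

The workload's requirement never changed; the drive underneath it did.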

The pattern

Insert a new media tier whose cost/performance/endurance profile sits between the two incumbents. Drive-level density lower than top-tier flash but substantially higher than HDD; BW/TB higher than HDD but lower than top-tier flash; cost/byte between the two.

Discipline:

  1. Identify the stranded-workload band. What BW/TB range is underserved? In Meta's 2025 case: ~10-20 MB/s/TB, the band that 16-20 TB HDDs served adequately and where large-batch-IO workloads currently overpay for TLC.
  2. Name the media that fills it. Often a denser-bit-cell variant (QLC here) that density-scaling has made economical.
  3. Validate the workload-endurance match. The new media's endurance floor must be met by the target workload's write profile with headroom. Meta explicitly matches QLC to read-BW-intensive + low-write workloads.
  4. Compose form factor + software stack. New media often exposes new asymmetries (R/W), new constraints (package count), new interfaces (userspace FTL). Each needs its own design.
  5. Co-design with a vendor partner willing to move at your pace. Meta + Pure Storage (DFM) is the canonical 2025 instance; Meta + NAND vendors for the standard-NVMe path.
  6. Accept hybrid-cost honesty up front. The new tier is usually not yet cost-competitive with the lower tier; early deployments are justified by power efficiency, density, and the cost of not solving the stranded-workload problem — not by total cost per byte.
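Steps 1-3 amount to a classification rule: place a workload by its required BW/TB band, then gate the middle tier on its write profile. A minimal sketch, with hypothetical thresholds (the band edges echo the table below; the QLC endurance floor is an assumption, not Meta's number):

```python
# Hypothetical tier-selection sketch for steps 1-3.
# Thresholds are illustrative, not Meta's actual values.
QLC_MAX_DWPD = 0.3  # assumed endurance floor for the middle tier

def select_tier(required_mb_s_per_tb: float, workload_dwpd: float) -> str:
    """Pick a media tier from bandwidth density and write rate.

    workload_dwpd is the drive-writes-per-day the workload would
    impose; QLC is assumed to tolerate only a low DWPD.
    """
    if required_mb_s_per_tb < 10:
        return "HDD"  # cold bulk: cheap capacity suffices
    if required_mb_s_per_tb <= 20 and workload_dwpd <= QLC_MAX_DWPD:
        return "QLC"  # stranded band, read-BW-intensive + low-write
    return "TLC"      # mixed / write-heavy, or beyond the QLC band

print(select_tier(5, 0.1))   # -> HDD
print(select_tier(15, 0.1))  # -> QLC
print(select_tier(15, 1.0))  # -> TLC: write profile violates QLC endurance
```

The endurance gate is the important subtlety: a workload can sit squarely in the stranded BW/TB band and still be wrong for the new tier if its write rate exceeds the media's floor (step 3).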

Canonical instance: Meta QLC 2025

Meta's 2025-03-04 post introduces QLC flash between HDD and TLC flash:

  Tier  BW/TB                    Role
  HDD   ~5-10 MB/s/TB (falling)  Cold bulk
  QLC   10-20 MB/s/TB            Batch IO / read-BW-intensive (new)
  TLC   50+ MB/s/TB              Mixed / write-heavy

Meta's density target: match the densest TLC-based server shipping today, with individual QLC drives scaling to 512 TB (standard U.2 15 mm form factor) or 600 TB (Pure Storage DirectFlash Module, DFM).
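One implication of those density targets: for a 512-600 TB drive to stay inside the table's QLC band, it must sustain multi-GB/s reads per drive — feasible over NVMe, far beyond any HDD. A quick check (capacities from the post, band from the table):

```python
# Per-drive sustained bandwidth needed to hold the QLC BW/TB band
# at the stated drive capacities.
band_mb_s_per_tb = (10, 20)   # QLC band from the table
for cap_tb in (512, 600):     # drive capacities from the post
    lo_gb_s = band_mb_s_per_tb[0] * cap_tb / 1000
    hi_gb_s = band_mb_s_per_tb[1] * cap_tb / 1000
    print(f"{cap_tb} TB drive: {lo_gb_s:.1f}-{hi_gb_s:.1f} GB/s sustained")
```

This is part of why the pattern's step 4 (form factor + software stack) is unavoidable: drives of this density stress the interface and the host stack, not just the media.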

When to apply

  • Two-tier structure under stress: bottom tier's BW/TB has fallen below workload requirements; top tier is overkill + overpaid.
  • New media available that covers the gap at a materially different cost / power / density profile.
  • Target workloads exist whose shape (here: read-BW-intensive + low-write) matches the new media's strengths.
  • Willingness to invest in the stack transformations — form factor, software, operations, migration.

When NOT to apply

  • Workload volume too small for a new tier's amortised operational cost to pay off.
  • New media's differentiation is marginal (e.g., second-source of existing tier, not a new band).
  • Workload shape doesn't match the new media's endurance/asymmetry profile — forcing mixed workloads onto QLC would burn endurance and force the rate controller into degenerate regimes.

Trade-offs

  • Operational complexity grows — three tiers means three form factors, three software paths, three hardware-vendor relationships.
  • Migration takes years at hyperscale. You plan for multi-year phased adoption.
  • Cost-per-byte parity is not required at launch: power savings, density, and avoided stranding can justify deployment even at a cost premium. Meta is explicit on this point.

Adjacent patterns

Seen in
