
Write amplification

Definition

Write amplification (WA) is the ratio of bytes physically written to storage to the logical bytes the application intended to write.

WA = (bytes written to disk) / (bytes written by application)
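As a quick sanity check of the ratio (the replica and WAL byte counts below are illustrative, not from the source):

```python
def write_amplification(physical_bytes: float, logical_bytes: float) -> float:
    """WA = bytes physically written to storage / bytes the application wrote."""
    return physical_bytes / logical_bytes

# Hypothetical: a 1 MiB logical write that lands as 4 replicas plus a 1 MiB WAL record.
logical = 2**20
physical = 4 * logical + logical
print(write_amplification(physical, logical))  # 5.0
```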

Forces that push WA above 1:

  • LSM compaction: every byte rewritten ~log(N) / B times as it merges up tiers / levels (concepts/lsm-compaction).
  • SSD page-level rewrites: an in-place update of a byte forces rewriting the whole flash page.
  • Replication / erasure coding: a single logical write becomes N replicas or (k + m) shards.
  • Immutable-store compaction: delete → live-blob rewrite during compaction to free the donor (concepts/storage-overhead-fragmentation).
  • Metadata: header / index / WAL updates around every logical write.

Any one of these can dominate; at scale, the WA term is often the first-order capacity / IOPS multiplier.
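Under the simplifying assumption that the layers are independent, their factors multiply along the write path. The numbers here are made up for illustration:

```python
from math import prod

# Hypothetical per-layer WA factors; real layers interact (e.g. compaction
# of EC volumes), so treating them as independent multipliers is a rough model.
layers = {
    "erasure_coding": (12 + 4) / 12,  # (k + m) / k with assumed k=12, m=4
    "lsm_compaction": 4.0,            # assumed amortized rewrites per byte
    "metadata_wal": 1.1,              # assumed metadata/WAL tax per write
}
total_wa = prod(layers.values())
print(round(total_wa, 2))  # 5.87
```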

Magic Pocket — background-write WA as the Live Coder's design target

The immediate context from the 2026-04-02 post:

Last year, we rolled out a new service that changed how data is placed across Magic Pocket. The change reduced write amplification for background writes, so each write triggered fewer backend storage operations. (Source: sources/2026-04-02-dropbox-magic-pocket-storage-efficiency-compaction)

Before the Live Coder service, writes went through a replicated path first and were re-encoded into erasure-coded volumes as a background operation — that re-encode is itself a rewrite, so each logical byte written produced multiple physical bytes.

The Live Coder path writes data directly into erasure-coded volumes, skipping the replicated-then-re-encoded intermediate. Background writes fan out to fewer backend storage operations per logical byte.

The unintended consequence was a different cost axis — the new path produced severely under-filled volumes, driving storage overhead up. The fix was multi-strategy compaction (L1 + L2 + L3) over the new distribution, with L3 itself reusing the Live Coder as a re-encoding pipeline. That latter choice is a deliberate trade: L3's rewrite WA is high on a per-blob basis (every live blob re-encoded gets a new identity + metadata entry), but applied only to the sparse tail, keeping absolute per-reclaimed-volume rewrite work low.
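The two write paths can be sketched with assumed parameters (the replication factor and EC geometry are placeholders; the post does not give them):

```python
def replicated_then_reencoded_wa(replicas: int, k: int, m: int) -> float:
    # Old path: write all replicas first, then re-encode into EC volumes
    # in the background; both steps write physical bytes.
    return replicas + (k + m) / k

def direct_ec_wa(k: int, m: int) -> float:
    # Live Coder path: encode straight into EC volumes, skipping the
    # replicated intermediate tier.
    return (k + m) / k

print(replicated_then_reencoded_wa(replicas=3, k=12, m=4))  # ≈ 4.33
print(direct_ec_wa(k=12, m=4))                              # ≈ 1.33
```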

Axis summary

Source of WA, with a canonical system and the shape of the multiplier:

  • LSM merging (Husky, RocksDB): amortized ~log(N) / B rewrites per byte.
  • SSD erase-block (any flash-backed store): page-level rewrite on every sub-page update.
  • Replication / EC (Magic Pocket, S3): (k + m) / k or N / 1 multiplier.
  • Compaction in immutable stores (Magic Pocket, Husky): live bytes rewritten once per reclaim of their containing unit.
  • Write-path layers (replicated-then-re-encoded writes): one rewrite per intermediate tier.

Design knobs that move WA

  • Choose a write path that skips a rewrite tier — Magic Pocket's Live Coder → direct EC writes (removed one rewrite); inline EC in S3 ShardStore.
  • Size reclaimable units to the workload — bigger units reduce per-byte compaction WA but increase reclamation latency.
  • Delay compaction until efficient-enough — Husky's lazy compaction is an order of magnitude cheaper than eager size-tiering.
  • Pick reclamation mechanism per segment — L1/L2 keep blobs under the same volume identity (low metadata WA); L3 rewrites into new volumes (high per-blob metadata WA, low per-reclaimed-volume total).
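The last knob can be sketched as a per-volume policy. The live-fraction threshold below is invented for illustration and is not stated in the source:

```python
def pick_reclamation(live_fraction: float, sparse_threshold: float = 0.2) -> str:
    """Choose a reclamation mechanism per volume (hypothetical policy)."""
    if live_fraction <= sparse_threshold:
        # Sparse tail: rewriting the few live blobs into new volumes is cheap
        # in absolute terms, despite high per-blob metadata WA.
        return "L3: re-encode live blobs into new volumes"
    # Fuller volumes: reclaim under the same volume identity to avoid
    # touching per-blob metadata.
    return "L1/L2: reclaim in place, keep volume identity"

print(pick_reclamation(0.05))
print(pick_reclamation(0.6))
```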

Relation to overhead

WA and storage overhead are orthogonal cost axes:

  • WA is a per-write rewrite cost — pays in I/O, compute, flash wear, CPU, metadata writes.
  • Overhead is a capacity-holding cost — pays in hardware fleet size.

The two interact at the compaction layer: compaction causes WA (rewrites live data) in order to reduce overhead. Picking a compaction strategy picks a trade between the two.
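That trade has a simple shape: compacting a unit rewrites its live bytes to reclaim its dead bytes, so the rewrite cost per reclaimed byte grows sharply with the unit's live fraction (a standard cleaning-cost model, not a formula from the source):

```python
def rewrite_cost_per_reclaimed_byte(live_fraction: float) -> float:
    # Rewriting one unit costs live_fraction bytes of WA and reclaims
    # (1 - live_fraction) bytes of capacity.
    return live_fraction / (1.0 - live_fraction)

for f in (0.1, 0.5, 0.9):
    print(f"live={f:.0%}: {rewrite_cost_per_reclaimed_byte(f):.2f} bytes rewritten per byte reclaimed")
```

Cheap on the sparse tail, explosive as the live fraction approaches 1 — which is why L3 is applied only to nearly-empty volumes.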
