Magic Pocket¶
Definition¶
Magic Pocket is Dropbox's in-house block-storage system — the exabyte-scale substrate that stores the majority of Dropbox customer file data. Built as part of Dropbox's 2015 migration off Amazon S3 and onto self-operated datacenters, Magic Pocket runs on Dropbox's custom-built server hardware across tens of thousands of servers with millions of drives. As of 2025, Dropbox is in what the team calls its "exabyte era", having scaled from 40 PB in 2012 to 600 PB in 2016 and past 1 EB since.
Magic Pocket is the destination of the 2015 migration project that brought >90% of roughly 600 PB of US customer data from off-premises hosts to Dropbox-managed datacenters.
Fleet composition (2025)¶
- >99% of the storage fleet uses SMR (shingled magnetic recording), a migration that began in 2018 and has steadily displaced traditional PMR drives generation over generation.
- Drive capacity today: 30+ TB per drive, with the 7th-gen Sonic platform adopting Western Digital's Ultrastar HC690 32 TB 11-platter SMR drive.
- Drive capacity at the start of the SMR migration (~2018): ~14 TB; per-drive capacity has roughly doubled over that window.
- Storage hardware lives in the Sonic chassis of Dropbox's 7th-generation hardware rollout, co-designed with drive vendors for vibration, acoustic, and thermal control at 30+ TB densities.
Why Magic Pocket exists¶
Dropbox's public framing: a product-first, vertically integrated infrastructure strategy. Running customer data on self-operated, self-designed hardware lets Dropbox:
- Optimize cost per TB across a storage profile (read-mostly, capacity-dominant, cold-leaning) that doesn't match cloud providers' generalist storage pricing.
- Co-design hardware and software: the filesystem / storage software and the chassis/firmware evolve together. See concepts/hardware-software-codesign.
- Absorb drive-technology transitions on its own timeline — Dropbox was an early adopter of SMR (2018) and an early adopter of the 32 TB Ultrastar HC690.
- Control placement, durability, and heat-management policy (see concepts/heat-management) end-to-end.
Design constraints¶
Magic Pocket operates under the concepts/hard-drive-physics constraint set — capacity per drive climbs, per-drive IOPS stays roughly flat — plus Dropbox-specific constraints surfaced by the 2025 seventh-gen post:
- Storage bandwidth per PB as a top-line metric: internal floor 30 Gbps/PB, expected future systems >100 Gbps/PB, 7th-gen chassis design target >200 Gbps.
- SAS topology reworked in 7th-gen to allocate bandwidth evenly across drives.
- Vibration envelope — as drive platter counts climb (11 in Ultrastar HC690), head position error signal (PES) events increase; 10k-RPM cooling fans compound the problem. Magic Pocket's chassis co-design addresses this directly.
- Network fabric — new 400G-ready datacenter architecture paired with the 7th-gen rollout; see the 400G post.
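The bandwidth-per-PB metric is simple arithmetic, but it is the top-line constraint the chassis is designed around. A quick sketch of the computation (the drive count is an illustrative assumption; only the 32 TB drive capacity and the >200 Gbps per-chassis target come from the post):

```python
def gbps_per_pb(chassis_bandwidth_gbps: float, drives: int, tb_per_drive: float) -> float:
    """Bandwidth-to-capacity ratio for one storage chassis, in Gbps/PB."""
    capacity_pb = drives * tb_per_drive / 1000.0  # decimal TB -> PB
    return chassis_bandwidth_gbps / capacity_pb

# Hypothetical gen-7-style chassis: 100 drives is a made-up number,
# used only to show how the ratio compares to the 30 Gbps/PB floor.
ratio = gbps_per_pb(chassis_bandwidth_gbps=200, drives=100, tb_per_drive=32)
print(f"{ratio:.1f} Gbps/PB")  # 62.5 Gbps/PB — comfortably above the 30 Gbps/PB floor
```

Note the tension the metric captures: adding bigger drives grows the denominator, so each capacity bump must be matched by NIC and SAS-topology bandwidth or the ratio decays toward the floor.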
Scale numbers (2025)¶
| Metric | Value |
|---|---|
| Servers | Tens of thousands |
| Drives | Millions |
| Data | Exabytes |
| Fraction of fleet on SMR | >99% |
| Datacenter self-ownership (since 2015) | >90% of stored data |
| 7th-gen bandwidth target (per chassis) | >200 Gbps |
| Internal floor: bandwidth per PB | 30 Gbps/PB |
Evolution across hardware generations¶
Magic Pocket's software is relatively stable; the hardware platform underneath it is versioned. The 7th generation (2025) is the latest:
- Gen 6 (Cartman, ~2020): AMD EPYC 7642 Rome 48-core; DDR4 256 GB; 25G NIC.
- Gen 7 (Crush/Dexter/Sonic, 2025): AMD EPYC 9634 Genoa 84-core; DDR5 512 GB; 100G NIC; NVMe gen5; co-developed storage chassis with 32 TB SMR; GPU tiers (Gumby, Godzilla) added for AI workloads like systems/dropbox-dash.
Each generation progressively increases cores-per-rack, bandwidth-per-chassis, and drive-capacity-per-chassis while holding rack count roughly constant (46 Crush servers per rack in gen-7, same 1U "pizza box" form factor as prior generations).
Relationship to Dropbox Dash and AI¶
Magic Pocket is a file-data store. Dash's retrieval/indexing workloads sit on top of it (directly or via derived indices). The 7th-gen hardware rollout introduced dedicated GPU tiers (Gumby for mixed workloads, Godzilla for dense LLM training) specifically to serve Dash and related ML features — not to displace Magic Pocket's HDD-dominant storage footprint.
Data model: immutable volumes + erasure coding¶
Magic Pocket is an immutable blob store. User files are broken into blobs (chunks of binary data — part or all of a user file) and stored in fixed-size volumes across the storage fleet. Core invariants:
- Blobs are never modified in place. Update or delete writes new data; old data remains until reclaimed.
- Volumes are closed once filled, and never reopened. Freeing space means rewriting live blobs into new volumes and retiring the old ones — i.e., compaction.
- Erasure coding (concepts/erasure-coding) protects nearly all data — it splits data into fragments plus parity spread across machines, giving fault tolerance equivalent to replication at significantly lower capacity overhead. The Live Coder service writes directly into erasure-coded volumes, bypassing the earlier replicated-then-re-encoded background path (a write-amplification win).
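The capacity argument for erasure coding is easy to make concrete. A sketch comparing raw-capacity overhead per logical byte (the fragment counts below are illustrative, not Magic Pocket's actual coding parameters):

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per logical byte under n-way replication."""
    return float(copies)

def erasure_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw bytes stored per logical byte under (k data + m parity) coding."""
    return (data_fragments + parity_fragments) / data_fragments

# Both schemes below survive the loss of any 2 machines holding the data:
print(replication_overhead(3))   # 3.0x raw capacity
print(erasure_overhead(10, 2))   # 1.2x raw capacity for the same fault tolerance
```

At exabyte scale the gap between 3.0x and ~1.2x overhead is the difference of entire datacenters' worth of raw drives, which is why nearly all data lands in erasure-coded volumes.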
Reclamation pipeline¶
Reclaiming space in an immutable store is a two-stage pipeline (see concepts/garbage-collection):
- Garbage collection — identifies blobs no longer referenced and marks them safe to remove. No disk space freed yet.
- Compaction — gathers live blobs from partially-drained volumes, writes them into new volumes, retires the old ones. The only mechanism that actually frees capacity.
At scale Magic Pocket processes "millions of deletes each day"; without continuous compaction, volumes gradually become partially filled and storage overhead climbs.
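The two stages divide cleanly: GC is a metadata-only marking pass, and compaction is the rewrite that actually frees bytes. A minimal in-memory sketch of the pipeline (names and data structures are ours, not Magic Pocket's):

```python
from dataclasses import dataclass, field

@dataclass
class Volume:
    """A closed, immutable volume mapping blob_id -> size in bytes."""
    blobs: dict = field(default_factory=dict)

def garbage_collect(volumes, live_refs):
    """Stage 1: identify blobs no longer referenced and mark them dead.
    Frees no disk space by itself."""
    dead = set()
    for vol in volumes:
        dead |= {blob for blob in vol.blobs if blob not in live_refs}
    return dead

def compact(volumes, dead, capacity):
    """Stage 2: rewrite surviving blobs into fresh volumes so the old
    ones can be retired — the only step that reclaims raw capacity."""
    new_volumes, current = [], Volume()
    for vol in volumes:
        for blob, size in vol.blobs.items():
            if blob in dead:
                continue  # skip garbage; its space is freed on retirement
            if sum(current.blobs.values()) + size > capacity:
                new_volumes.append(current)
                current = Volume()
            current.blobs[blob] = size
    if current.blobs:
        new_volumes.append(current)
    return new_volumes  # source volumes can now be retired
```

For example, two half-dead volumes whose live blobs fit one destination volume compact down to a single volume, halving the raw footprint for that data.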
Compaction strategy (2026 redesign)¶
After a new EC write path (the Live Coder service) rolled out and produced a long tail of severely under-filled volumes (<5% live data per volume in the worst cases), the single baseline compaction strategy could not reclaim overhead quickly enough. Magic Pocket deployed multi-strategy compaction (patterns/multi-strategy-compaction) over disjoint segments of the volume fill-level distribution:
- L1 — baseline "host-plus-donor" packing. Pick a highly-filled host, pick donors whose live bytes fit the host's free space, rewrite. Optimal when most volumes are already highly filled (steady state). Average reclaim: <1 full volume per run (only donors fully drain).
- L2 — dynamic-programming multi-volume packing. For the middle of the distribution (moderately under-filled volumes). DP over (volume-index, count, capacity) with granularity scaling and a max-volumes-per-run cap; backtracks to the max-packing combination that fits one destination volume. Production impact vs L1-only cells: 30–50% lower compaction overhead over a week, 2–3× faster overhead reduction.
- L3 — streaming re-encoding via Live Coder. For the sparsest tail. Continuously feeds live blobs from near-empty volumes into the Live Coder erasure-coder; reclaims each source volume immediately once drained. Per-reclaimed-volume rewrite work is low (sparse by construction); metadata cost is high because every blob gets a new volume identity → new location entry in Panda. Canonical patterns/streaming-re-encoding-reclamation.
Strategies run concurrently over disjoint eligibility boundaries with per-strategy rate limits + cell-local traffic (no cross-DC compaction). Per-strategy candidate ordering:
- L1: conservative (keep placement risk + metadata load low).
- L2: aggressive (denser packings → more reclaim per run).
- L3: sparsest-first (minimize per-reclaimed-volume rewrite work).
Host eligibility threshold: a static per-strategy knob was replaced with a dynamic control loop driven by fleet overhead signals — rising overhead ⇒ raise the threshold (prioritize high-yield runs); stabilizing overhead ⇒ lower the threshold (stay responsive to deletes). Same primitive family as Robinhood's PID controller over load-balancing weights.
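The control loop can be sketched as a minimal step controller — all constants, bounds, and signal shapes here are assumptions; the production loop is closer to a PID driven by fleet overhead signals:

```python
def adjust_threshold(threshold, overhead_now, overhead_prev,
                     step=0.02, lo=0.05, hi=0.95):
    """One tick of a sketched host-eligibility control loop.

    Rising fleet overhead -> raise the threshold, so compaction spends
    its budget on high-yield hosts first. Stable or falling overhead ->
    lower the threshold, so the system stays responsive to the ordinary
    stream of deletes. step/lo/hi are illustrative constants."""
    if overhead_now > overhead_prev:
        return min(hi, threshold + step)
    return max(lo, threshold - step)

# Overhead climbing: tighten eligibility toward high-yield runs.
t = adjust_threshold(0.50, overhead_now=1.30, overhead_prev=1.20)
# Overhead stabilizing: relax again.
t = adjust_threshold(t, overhead_now=1.25, overhead_prev=1.30)
```

The design choice worth noting is that the knob is no longer a constant anyone tunes by hand: the fleet's own overhead trend moves it, which is what lets the same policy behave sensibly both during an overhead spike and in steady state.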
Post-rollout: overhead driven below the pre-incident baseline.
Why this matters¶
Quote from the post:

> Storage overhead directly determines how much raw capacity we need in order to store the same amount of live user data. Even small changes in overhead materially affect hardware purchases and fleet growth.
Against a fleet of tens of thousands of servers, millions of drives, >99% SMR, exabyte-scale live data — small overhead deltas translate directly into meaningful capex. Distribution-shape-aware multi-strategy compaction + dynamic-control-loop tuning are the architectural tools that keep the storage growth curve predictable.
Seen in¶
- sources/2025-08-08-dropbox-seventh-generation-server-hardware — 7th-gen hardware refresh; Magic Pocket's current substrate (Crush / Dexter / Sonic platforms), >99% SMR adoption, exabyte scale, 32 TB Ultrastar HC690 first-mover, >200 Gbps/chassis design target, chassis co-development to handle 30+ TB drive vibration envelope.
- sources/2026-04-02-dropbox-magic-pocket-storage-efficiency-compaction — immutable-volume data model + two-stage reclamation (GC → compaction); the Live Coder incident (long tail of <5%-live volumes, fleet-wide overhead spike) as the forcing function for multi-strategy compaction (L1 baseline / L2 DP-packing / L3 streaming re-encoding); dynamic control-loop tuning of the host eligibility threshold; Panda metadata as the binding downstream constraint on L3 aggressiveness; overhead driven below the pre-incident baseline.
External references¶
- Inside the Magic Pocket (2016 — original public introduction)
- Magic Pocket infrastructure
- Four years of SMR storage (2022)