PATTERN

Mirror-first repack validation

Mirror-first repack validation is the pre-production step for a structural Git-repository rewrite (such as a server-side repack) on a managed platform: run the exact target operation on a mirror of the repository first, measure production-relevant metrics, accept or reject the tradeoff, and only then schedule the live rollout.

Why

A Git repack changes how billions of objects are physically organised on disk: the code content is unchanged, but every clone, fetch, and push interacts with the new layout. On a managed SaaS like GitHub, the platform also relies on server-side structures (reachability bitmaps, delta islands) that the repack flags interact with. Running the repack directly in production without validation risks:

  • Fetch/clone performance regressions at the tail (new layout might happen to penalise a specific access pattern).
  • Push regressions (receive-pack rebuilding deltas differently against the new pack).
  • API-latency regressions on the platform's code APIs.
  • Edge cases where specific repos / refs interact badly with the chosen --window / --depth values.

Mirror-first validation turns all of these into data before any engineer or CI job is affected.
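
The server-side structures named above map onto stock-git knobs. A hedged sketch of that interaction follows; a managed platform wires these internally, and the throwaway repo here exists only so the sketch runs as-is:

```shell
# How bitmaps and delta islands appear in stock git; a managed platform
# configures them server-side. Uses a throwaway repo so this runs anywhere.
set -eu
REPO=$(mktemp -d)
git init -q "$REPO"
git -C "$REPO" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "seed commit"

# Reachability bitmaps: precomputed object-graph bitmaps that speed up
# object counting during clone/fetch.
git -C "$REPO" config repack.writeBitmaps true

# Delta islands: confine delta chains to ref groups so serving one group
# never requires deltas based in another.
git -C "$REPO" config repack.useDeltaIslands true
git -C "$REPO" config pack.island 'refs/heads/'

# A full repack rebuilds both structures against the new object layout,
# which is why repack flags and these structures interact at all.
git -C "$REPO" repack -q -a -d -f --window=250 --depth=250
ls "$REPO"/.git/objects/pack
```

Both knobs are rebuilt from scratch by `repack -a -d -f`, so any change to `--window`/`--depth` changes what the serving structures are computed against.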

Solution shape

  1. Create a mirror. Clone the target repo with git clone --mirror into a parallel location on the platform (not in production serving).
  2. Run the target repack on the mirror with the exact flags and parameters planned for production (Dropbox + GitHub: --window=250 --depth=250).
  3. Measure production-shaped metrics against the repacked mirror:
     • Fetch duration distribution (p50/p90/p99, tail movement).
     • Push success rate.
     • Platform API latency.
     • Clone time from a representative client.
  4. Decide on the tradeoff. The compression ratio is known up front (Dropbox's mirror: 78 GB → 18 GB, ~4× reduction); accept "minor movement at the tail of fetch latency" if that tradeoff buys a 4× size win, reject otherwise.
  5. Schedule the production rollout gradually (platform-side, one replica per day is GitHub's standard cadence; see patterns/server-side-git-repack).

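The mirror and repack steps can be sketched with stock git. To stay runnable, the sketch fabricates a tiny throwaway source repo; in practice SRC would be the production monorepo and MIRROR_DIR a path on a spare host:

```shell
# Sketch of steps 1-2 with stock git. SRC and MIRROR_DIR are placeholders,
# not values from the case study.
set -eu
WORK=$(mktemp -d)
SRC="$WORK/src"                  # stand-in for the production repo
MIRROR_DIR="$WORK/mirror.git"    # parallel location, not serving traffic

git init -q "$SRC"
git -C "$SRC" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "seed commit"

# Step 1: full mirror clone (every ref, an exact copy of the ref namespace).
git clone -q --mirror "$SRC" "$MIRROR_DIR"

# Step 2: the exact production-candidate repack (flags from the case study).
# -a: repack all objects; -d: drop redundant packs; -f: recompute all deltas.
git -C "$MIRROR_DIR" repack -q -a -d -f --window=250 --depth=250

# On-disk size feeds the step-4 tradeoff decision.
du -sh "$MIRROR_DIR"/objects/pack
```

Running the repack on the mirror rather than a shallow copy matters: delta recomputation with -f is sensitive to the full object population, so a partial clone would not predict production pack geometry.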
What it looks like in practice (Dropbox 2026-03-25)

  • Mirror: full git clone --mirror of the server monorepo.
  • Repack run by GitHub on the mirror with the chosen --window=250 --depth=250 configuration.
  • Result: 78 GB → 18 GB; minor movement at the fetch-latency tail (explicitly deemed acceptable for the 4× size reduction); push success and API latency held.
  • Production rollout followed: one replica per day, ~1 week, with read-write replicas first and rollback buffer at the end.
  • Final result: 87 GB → 20 GB, clone time >1h → <15 min, no regressions.

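The "fetch duration distribution" metric from step 3 can be gathered with a loop of timed fetches plus a nearest-rank percentile helper. A minimal sketch, which builds its own throwaway mirror and client so it runs as-is (in practice MIRROR points at the repacked mirror, RUNS is much larger, and GNU date is assumed for %N):

```shell
# Time repeated fetches against a mirror and report p50/p90/p99.
set -eu
WORK=$(mktemp -d)
MIRROR="$WORK/mirror.git"
CLIENT="$WORK/client"
RUNS=5
SAMPLES="$WORK/fetch_times.txt"

git init -q "$WORK/src"
git -C "$WORK/src" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "seed commit"
git clone -q --mirror "$WORK/src" "$MIRROR"
git clone -q "$MIRROR" "$CLIENT"

for _ in $(seq "$RUNS"); do
  start=$(date +%s.%N)                 # GNU date: seconds.nanoseconds
  git -C "$CLIENT" fetch -q origin
  end=$(date +%s.%N)
  awk -v s="$start" -v e="$end" 'BEGIN { printf "%.4f\n", e - s }' >> "$SAMPLES"
done

# Nearest-rank percentile over a numeric sample file: pctl <p> <file>
pctl() {
  sort -n "$2" | awk -v p="$1" '
    { a[NR] = $1 }
    END { i = int(NR * p / 100); if (i < NR * p / 100) i++; print a[i] }'
}

echo "p50=$(pctl 50 "$SAMPLES") p90=$(pctl 90 "$SAMPLES") p99=$(pctl 99 "$SAMPLES")"
```

Comparing these percentiles before and after the repack is what surfaces the "minor movement at the tail" that the decision step then judges.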
Caveats

  • Cost: running the repack twice (mirror + production) doubles compute time; the mirror-side cost is cheap insurance, not free.
  • Mirror ≠ production traffic. Fetch-latency measurements against a mirror capture the geometry of the new pack files but not the full live traffic mix (concurrent pushes / receive-pack contention / etc.). That's why the subsequent production rollout still needs replica-by-replica ramping.
  • Tradeoff decisions can't be automated — Dropbox / GitHub explicitly judged the tail movement as acceptable. There is no universal "good enough" threshold; it's a call against the compression win and the alternative (hit a 100 GB repo limit with no fix).

Seen in

  • Dropbox, 2026-03-25 (GitHub-run repack of the server monorepo; detailed above).