PATTERN
Server-side Git repack¶
Server-side Git repack is the pattern of permanently fixing a
Git repo's pack-file size on a managed Git host (e.g.
GitHub) by running an aggressive, host-supported
git repack on the host's servers, not on a local mirror.
Problem¶
A Git repo's on-disk and on-wire size is determined by how its
pack files are constructed. On a managed
SaaS, the host rebuilds the transfer pack dynamically per
request (see systems/github) from its own packing
configuration — so local git repack improvements pushed back up
are immediately re-derived from the server's own rules and do not
persist.
Consequence: if the default pack heuristic has gone pathological on your repo (canonical case: Git's 16-char path-pairing heuristic mismatching your directory layout), you cannot fix it from the client side. The repack has to run where the authoritative pack files live.
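The server's packing is still observable from the client: the pack a fresh clone receives is the one the host built, and `git count-objects -v -H` reports its on-disk footprint. A minimal self-contained sketch, using a throwaway local repo to stand in for a real clone of the hosted repo:

```shell
set -eu
# Throwaway repo standing in for a fresh clone of the hosted repo;
# after a real `git clone`, the pack on disk is the one the server built.
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git config user.email dev@example.com
git config user.name  dev
seq 1 5000 > data.txt
git add data.txt
git commit -qm 'seed data'
git gc -q                  # pack the loose objects so size-pack is meaningful
git count-objects -v -H    # size-pack: on-disk size of the pack files
```

Comparing that number before and after a local repack is what demonstrates the gap between the server's heuristic and what an aggressive repack can achieve.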
Solution shape¶
- Reproduce locally to confirm the root cause. `git clone --mirror` the repo and run aggressive repack variants (`git repack -adf --depth=250 --window=250`, with or without experimental flags like `--path-walk`) to show that a repack can produce the target size. Dropbox's local `--path-walk` experiment hit 84 GB → 20 GB in ~9 h, proving the hypothesis.
- Work with the platform provider to pick a server-compatible repack shape. Aggressive flags that bypass the default pairing heuristic (e.g. Git's `--path-walk`) may be incompatible with server-side optimizations the platform uses to keep clone/fetch fast; GitHub specifically relies on bitmaps and delta islands that don't compose with `--path-walk`. The platform-supported alternative is usually to keep the default pairing heuristic but raise `--window` and `--depth` (Dropbox + GitHub landed on `--window=250 --depth=250`).
- Mirror-first validation (see patterns/mirror-first-repack-validation). Have the platform repack a test mirror; measure fetch-latency distribution, push success rate, and API latency; accept tail movement within acceptable bounds as a tradeoff against the compression win.
- Gradual production rollout. The platform performs the repack one replica per day over a week, read-write replicas first, with buffer time at the end for rollback. A repack changes the physical layout of billions of objects for every interaction with the repo, so the ops discipline matches any other massive in-place infrastructure change.
- Structural follow-up. Fix the root cause (directory layout, file naming) so the default heuristic works correctly going forward, and stand up patterns/repo-health-monitoring so the next regression is visible early.
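The reproduce-locally step can be sketched end to end. A self-contained sketch, using a throwaway repo in place of the real remote (for a hosted repo you would mirror-clone its URL instead); the flag set is the one named above, minus `--path-walk`, which requires a Git build carrying that experimental option:

```shell
set -eu
tmp=$(mktemp -d)
# Seed a throwaway "upstream" repo (stands in for the hosted monorepo)
git init -q "$tmp/upstream"
cd "$tmp/upstream"
git config user.email dev@example.com
git config user.name  dev
for i in 1 2 3; do
  seq 1 3000 > "file$i.txt"
  git add "file$i.txt"
  git commit -qm "add file$i"
done
# Mirror-clone as the pattern prescribes; --no-local forces the normal
# transport so the clone arrives as a pack, like a real remote clone would
cd "$tmp"
git clone -q --mirror --no-local "$tmp/upstream" mirror.git
cd mirror.git
du -sk objects/pack                       # pack footprint before
git repack -adf --depth=250 --window=250  # aggressive, full repack
du -sk objects/pack                       # pack footprint after
```

On a real monorepo the repack step is where the multi-hour wallclock goes; the two `du` readings are the before/after numbers that make the case to the platform provider.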
Why it matters¶
Generalises across any structural VCS / package-format issue where the authoritative store lives on a managed SaaS:
- The diagnosis can be done locally against a mirror.
- The fix has to run where the authoritative artefact is constructed.
- The rollout has to treat the artefact as production infrastructure (billions of clients/reads in flight).
Canonical instance:
Dropbox's 87 GB → 20 GB GHEC repack: confirmed root-cause with
a local --path-walk experiment, couldn't ship that flag, ran
tuned --window=250 --depth=250 on GitHub's servers via GitHub
Support, rolled replica-by-replica over a week, hit 77% size
reduction + clone time >1h → <15 min, no fetch/push/API-latency
regressions.
Operational details (2026-03-25 Dropbox instance)¶
- Production repack on GitHub's servers: tuned `--window=250 --depth=250` (default heuristic retained for server-side compatibility).
- Local experimental measurement for root-cause confirmation: `git repack -adf --depth=250 --window=250` on a `git clone --mirror` copy of the repo.
- Rollout cadence: one replica per day, ~1 week total.
- Test-mirror repack wallclock (local Dropbox measurement): ~9 hours.
Caveats¶
- Aggressive `--window`/`--depth` trade repack time for compression ratio; plan for multi-hour server-side work, not a minute-scale operation.
- Server-side bitmaps / delta islands (GitHub) foreclose some repack flags; your fix space is bounded by what the platform supports.
- A repack does not alter the content of your code — just its physical pack layout — so semantic correctness is not at risk, but every client's clone/fetch/push path touches the new layout.
- Structural follow-up (directory layout reshape) is what prevents the issue from recurring; without it, the next round of translation updates (or whatever the domain-specific analogue is) starts re-inflating the packs.
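The repo-health-monitoring follow-up can start as simply as a scheduled check on `size-pack`, which `git count-objects -v` reports in KiB. A minimal sketch; the 50 GiB threshold is an illustrative assumption, not a number from the Dropbox instance:

```shell
set -eu
# Throwaway repo so the sketch is self-contained; in practice, run this
# on a schedule inside a mirror of the hosted repo.
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git config user.email dev@example.com
git config user.name  dev
echo 'hello' > f.txt
git add f.txt
git commit -qm init
git gc -q

# Illustrative threshold: alert once the packs pass 50 GiB (value in KiB)
THRESHOLD_KB=$((50 * 1024 * 1024))
size_kb=$(git count-objects -v | awk -F': ' '/^size-pack/ {print $2}')
if [ "$size_kb" -gt "$THRESHOLD_KB" ]; then
  echo "ALERT: pack size ${size_kb} KiB exceeds threshold" >&2
else
  echo "OK: pack size ${size_kb} KiB"
fi
```

Trending this number over time is what makes the "next round of translation updates starts re-inflating the packs" failure visible early instead of at the next 80 GB crisis.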
Seen in¶
- sources/2026-03-25-dropbox-reducing-monorepo-size-developer-velocity — canonical instance on Dropbox's 87 GB server monorepo.
Related¶
- patterns/mirror-first-repack-validation — the pre-production validation step.
- patterns/repo-health-monitoring — the ongoing operational discipline.
- concepts/git-delta-compression — the root-cause class this pattern addresses.
- concepts/git-pack-file — what's actually being rewritten.
- systems/git / systems/github — the protocol + platform.
- systems/dropbox-server-monorepo — the monorepo this was applied to.