Dropbox server monorepo¶
Dropbox server monorepo is the single Git repository that holds the majority of Dropbox's backend services and libraries. In Dropbox's own framing: "almost every product change flows through a single place: our server monorepo … the repository sits at the center of nearly everything we build."
The canonical day-in-the-life for this repo: pull latest, build, test, review, merge, ship. CI jobs repeatedly clone fresh. Both flows — engineer setup and CI — incur the repo's clone cost.
Scale (2026-03-25 post)¶
- Pre-fix size: ~87 GB.
- Pre-fix initial clone time: >1 hour.
- Pre-fix growth rate: 20–60 MB/day typical, spikes >150 MB/day.
- GitHub Enterprise Cloud (GHEC) hard repo-size limit: 100 GB (the post frames Dropbox as months away from hitting it).
- Post-fix size: ~20 GB (77% reduction).
- Post-fix initial clone time: <15 min in many cases.
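The figures above can be cross-checked with a little arithmetic. The sketch below recomputes the reduction percentage and estimates how long the pre-fix repo had before reaching the 100 GB limit at the quoted growth rates. It assumes decimal units (1 GB = 1000 MB) and a constant daily growth rate, neither of which the post states; the function names are illustrative only.

```python
# Figures quoted in the 2026-03-25 post.
PRE_FIX_GB = 87
POST_FIX_GB = 20
LIMIT_GB = 100

# Assumption: decimal units (1 GB = 1000 MB); the post does not specify.
MB_PER_GB = 1000


def reduction_pct(before_gb: float, after_gb: float) -> float:
    """Percentage size reduction going from before_gb to after_gb."""
    return (before_gb - after_gb) / before_gb * 100


def days_to_limit(size_gb: float, limit_gb: float, growth_mb_per_day: float) -> float:
    """Days until size_gb reaches limit_gb, assuming constant daily growth."""
    return (limit_gb - size_gb) * MB_PER_GB / growth_mb_per_day


if __name__ == "__main__":
    print(f"reduction: {reduction_pct(PRE_FIX_GB, POST_FIX_GB):.0f}%")
    for rate in (20, 60, 150):
        days = days_to_limit(PRE_FIX_GB, LIMIT_GB, rate)
        print(f"{rate} MB/day -> about {days:.0f} days of headroom")
```

Under these assumptions the 13 GB of headroom lasts roughly 217 days at 60 MB/day and roughly 87 days at the 150 MB/day spike rate, which is consistent with the post's "months" framing.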
Language / content mix¶
Not fully disclosed. What is named in the 2026-03-25 post:
- Backend services + libraries are the bulk — the monorepo architecture is framed as easing cross-service change.
- AI features (ranking systems, retrieval pipelines, evaluation logic, UI surfaces): small changes that cross multiple backend services, and a prime workload for the Dash engineering loop.
- Internationalization (i18n) files under `i18n/metaserver/[language]/LC_MESSAGES/[filename].po`: not volumetrically large, but the pathological case for Git's delta-compression heuristic. The language code falls before the 16-char trailing-path window, so Git pairs `.po` files across languages instead of within one, producing oversized pack contributions from routine translation updates.
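To make the window problem concrete, here is a minimal sketch that groups paths by their last 16 characters. This is a deliberate simplification: Git derives a hash from the trailing characters of the path rather than comparing raw suffixes, so treat the slice as a stand-in for that behaviour. The file name `messages.po` and the language codes are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

WINDOW = 16  # trailing characters considered, per the post


def trailing_window(path: str, window: int = WINDOW) -> str:
    """Return the last `window` characters of a path.

    Simplified stand-in for Git's name-hash heuristic, not the real hash.
    """
    return path[-window:]


def group_by_window(paths):
    """Group paths whose trailing windows are identical."""
    groups = defaultdict(list)
    for p in paths:
        groups[trailing_window(p)].append(p)
    return dict(groups)


# Hypothetical file name and language codes following the layout in the post.
paths = [
    f"i18n/metaserver/{lang}/LC_MESSAGES/messages.po"
    for lang in ("de", "fr", "ja")
]
groups = group_by_window(paths)

if __name__ == "__main__":
    for suffix, members in groups.items():
        print(repr(suffix), "->", members)
```

All three paths share the window `AGES/messages.po`, because the language directory sits more than 16 characters from the end. In this simplified model the heuristic cannot tell the languages apart, so it treats different translations of the same file as natural delta candidates.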
2026 repo-size incident / fix¶
Root cause identified as the 16-char trailing-path delta heuristic interacting badly with the i18n layout (concepts/git-delta-compression), not committed payload.
Fix path:
- Local reproduction: `git repack -adf --depth=250 --window=250 --path-walk` on a `git clone --mirror` copy shrank it from 84 GB to 20 GB in ~9h, confirming the hypothesis.
- GitHub Support reports `--path-walk` is incompatible with server-side bitmaps / delta islands; local fixes also don't survive GitHub's dynamic server-side transfer-pack construction (systems/github).
- GitHub recommends a server-side repack with tuned `--window=250 --depth=250` (no `--path-walk`): patterns/server-side-git-repack.
- Mirror-first validation: 78 GB → 18 GB on a test mirror; fetch latency distribution, push success, and API latency held within tradeoff tolerance (patterns/mirror-first-repack-validation).
- Production: gradual rollout by GitHub over a week, one replica per day, read-write replicas first, rollback buffer at the end.
- Result: 87 GB → 20 GB (77% reduction), clone time >1h → <15 min, no regressions.
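The local reproduction step can be sketched as a small helper that assembles the Git invocations. The repack flags are the ones quoted above; the remote URL, mirror directory, function name, and the `git count-objects -vH` size checks before and after are assumptions added for illustration. The helper only builds and prints the commands, since the repack itself took about nine hours in the post, and `--path-walk` needs a Git release recent enough to support it.

```python
import shlex


def local_repack_commands(remote_url: str, mirror_dir: str, path_walk: bool = True):
    """Build argv lists for a mirror clone followed by an aggressive repack.

    Hypothetical helper: it returns the commands without running them.
    """
    clone = ["git", "clone", "--mirror", remote_url, mirror_dir]
    # Assumption: measure pack size before and after with count-objects.
    size = ["git", "-C", mirror_dir, "count-objects", "-vH"]
    repack = ["git", "-C", mirror_dir, "repack", "-adf",
              "--depth=250", "--window=250"]
    if path_walk:
        # Local experiment only; the server-side fix omitted this flag.
        repack.append("--path-walk")
    return [clone, size, repack, size]


if __name__ == "__main__":
    # Placeholder URL and directory, not Dropbox's actual remote.
    for argv in local_repack_commands("git@example.com:org/server.git",
                                      "server-mirror.git"):
        print(shlex.join(argv))
```

Calling the helper with `path_walk=False` yields the flag set that matches the server-side recommendation, which makes the difference between the local experiment and the supported fix explicit.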
Two follow-ups:
- i18n layout restructured so delta pairing falls inside the 16-char window (prevent regression at the source).
- Recurring repo-stats dashboard tracking overall repo size, growth rate, fresh clone time, and per-subtree storage distribution — patterns/repo-health-monitoring canonical instance.
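One way such a dashboard could collect its size metric is to sample `git count-objects -v` periodically and compare successive `size-pack` values, which Git reports in KiB. The sketch below is an assumption about how that might look, not a description of Dropbox's dashboard: the function names are hypothetical, and reusing the post's 150 MB/day spike figure as an alert threshold is an illustrative choice.

```python
def parse_count_objects(output: str) -> dict:
    """Parse `git count-objects -v` output into a {key: int} mapping."""
    stats = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[key.strip()] = int(value)
    return stats


def growth_mb_per_day(prev_kib: int, curr_kib: int, days: float) -> float:
    """Average daily pack growth; KiB is converted with 1024 KiB per MB-ish unit."""
    return (curr_kib - prev_kib) / 1024 / days


# Assumption: alert threshold borrowed from the spike figure in the post.
SPIKE_MB_PER_DAY = 150


def is_spike(rate_mb_per_day: float) -> bool:
    """True when daily growth exceeds the assumed alert threshold."""
    return rate_mb_per_day > SPIKE_MB_PER_DAY


if __name__ == "__main__":
    # Made-up samples taken one day apart, for illustration only.
    yesterday = parse_count_objects("count: 0\nsize-pack: 1000000\n")
    today = parse_count_objects("count: 0\nsize-pack: 1204800\n")
    rate = growth_mb_per_day(yesterday["size-pack"], today["size-pack"], days=1)
    print(f"growth: {rate:.0f} MB/day, spike: {is_spike(rate)}")
```

Fresh clone time and per-subtree storage distribution would need separate collectors; this sketch covers only overall size and growth rate.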
Why this matters¶
This is a canonical wiki instance of structural, not behavioural, repo growth: no rogue binaries, no leaked deps, no generated files — the interaction between directory layout and Git's internal heuristic was the entire cause. The remediation sits at the boundary between Dropbox's code and GitHub's managed platform and required collaborative debugging + a mutually-supported fix shape. Both are load-bearing lessons for any large Git monorepo on a managed SaaS.
Seen in¶
- sources/2026-03-25-dropbox-reducing-monorepo-size-developer-velocity — primary source.
Related¶
- concepts/monorepo — the general architectural pattern.
- systems/git — the VCS whose internals drove the incident.
- systems/github — the managed platform that determines the fix shape.
- concepts/git-delta-compression — the specific failure mode.
- patterns/server-side-git-repack — the fix pattern.
- patterns/mirror-first-repack-validation — the validation step.
- patterns/repo-health-monitoring — the operational follow-up.
- companies/dropbox — company context.