PATTERN Cited by 1 source
Automated migration at monorepo scale
Summary
The wrapping architectural pattern for large-scale language-level or framework-level code migrations inside a monorepo with tens of millions of lines of code: scale out IDE-derived tooling via headless execution, fan out per file across a server fleet, use a daily-diff cron as the delivery channel, and involve human review only at the final diff stage. The pattern applies past the scale threshold where the naive per-file IDE-button workflow stops being viable: at Meta's ~100,000 Android files, a couple of minutes per file comes to roughly 100,000 × 2 = 200,000 minutes (over 3,300 hours) of developer attention. At that scale, automation is the only practical path.
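The scale threshold can be checked with a quick back-of-envelope calculation. The sketch below uses the file count from the text and reads "a couple of minutes" as exactly 2; the 8-hour workday is an assumption added here for illustration, not a figure from the source.

```python
# Back-of-envelope cost of the manual IDE-button workflow.
FILES = 100_000        # from the text: ~100,000 Android files
MINUTES_PER_FILE = 2   # from the text: "a couple of minutes", read as 2

total_minutes = FILES * MINUTES_PER_FILE
total_hours = total_minutes / 60

# Assumption (not from the source): an 8-hour working day.
HOURS_PER_WORKDAY = 8
workdays = total_hours / HOURS_PER_WORKDAY

print(f"{total_minutes:,} minutes = {total_hours:,.0f} hours = {workdays:,.0f} workdays")
# prints "200,000 minutes = 3,333 hours = 417 workdays"
```

Under these assumptions the manual route costs well over a developer-year of pure button-clicking, before any review or fix-up time is counted.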
Component primitives
- Headless IDE inspection — take inspections / refactorings that are usually clicked in an IDE and drive them from a server process.
- Pipeline with open-ended passes — the architectural shape of the transformation itself, so new corner cases are additive work.
- Metaprogramming on broken code — the AST-tooling choice that lets intermediate stages proceed even when the source doesn't compile.
- Build-error-driven fix loop — the final phase that uses the compiler as its oracle for residual fixes.
- Daily-diff cron — the delivery channel turning the pipeline into a steady stream of PRs.
- Bot-safer-than-human — the discipline of pushing delicate transformations into the pipeline to reduce reviewer error.
- [[patterns/upstream-collaboration-as-migration-unblock|Upstream collaboration]] — the escape valve when the pipeline hits the ceiling of the underlying tool.
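Two of the primitives above, the pipeline with open-ended passes and the build-error-driven fix loop, can be sketched in a few lines. This is a toy illustration under loud assumptions, not Kotlinator's implementation: it rewrites plain strings rather than an AST, and `toy_compile`, `strip_semicolons`, `make_nullable`, and the `NULL_FOR_NON_NULL` error code are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Pass = Callable[[str], str]  # a pass maps source text to source text


def run_passes(source: str, passes: List[Pass]) -> str:
    """Apply passes in order; supporting a new corner case means appending a pass."""
    for p in passes:
        source = p(source)
    return source


@dataclass
class BuildError:
    code: str  # hypothetical error identifier
    line: int  # zero-based line index


Fixer = Callable[[str, BuildError], str]


def fix_loop(source: str,
             compile_fn: Callable[[str], List[BuildError]],
             fixers: Dict[str, Fixer],
             max_rounds: int = 10) -> Tuple[str, bool]:
    """Use the compiler as an oracle: compile, fix one known error, recompile."""
    for _ in range(max_rounds):
        errors = compile_fn(source)
        if not errors:
            return source, True
        progressed = False
        for err in errors:
            fixer = fixers.get(err.code)
            if fixer is None:
                continue
            fixed = fixer(source, err)
            if fixed != source:
                source, progressed = fixed, True
                break  # recompile, since the error list is now stale
        if not progressed:
            return source, False  # residual errors need a human
    return source, False


# --- Toy stand-ins (hypothetical; a real pipeline would call the real compiler) ---
def strip_semicolons(src: str) -> str:
    return "\n".join(line.rstrip(";") for line in src.splitlines())


def toy_compile(src: str) -> List[BuildError]:
    return [BuildError("NULL_FOR_NON_NULL", i)
            for i, line in enumerate(src.splitlines())
            if ": String = null" in line]


def make_nullable(src: str, err: BuildError) -> str:
    lines = src.splitlines()
    lines[err.line] = lines[err.line].replace(": String = null", ": String? = null")
    return "\n".join(lines)


draft = "val name: String = null;\nval n: Int = 1;"
cleaned = run_passes(draft, [strip_semicolons])
converted, ok = fix_loop(cleaned, toy_compile, {"NULL_FOR_NON_NULL": make_nullable})
print(converted)
print(ok)
```

The loop recompiles after every applied fix because one fix can remove or shift other errors, and it stops when no registered fixer makes progress so that residual errors surface for human attention instead of looping forever.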
Canonical wiki instance — Meta's Kotlinator
Meta translated >40,000 files (of ~100,000) of its ~10M-line Android monorepo from Java to Kotlin using Kotlinator, a six-phase pipeline that composes every primitive above. Key scale datapoints:
| Dimension | Datum |
|---|---|
| Starting Java LOC | ~10,000,000 |
| Files to convert | ~100,000 |
| Conversions shipped | >40,000 (at 2024-12) |
| Remote conversion time | ~30 min / file |
| Pipeline custom steps | >200 |
| Kotlin-first Android at Meta since | 2020 |
The trade-off the pattern makes: total wall-clock time per file goes up (from a couple of minutes in the IDE to ~30 minutes remotely), while developer time per file goes to zero. At monorepo scale this is net-positive, because files convert unattended and in parallel, and human attention is the binding constraint.
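The trade-off is easier to see with numbers. The sketch below combines the ~30 min / file figure from the table with a per-file fan-out; the fleet size of 1,000 workers is an assumption chosen for illustration, and `convert_file` is a hypothetical stand-in for the real headless conversion.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import List

REMOTE_MINUTES_PER_FILE = 30  # from the table: ~30 min / file


def fleet_wall_clock_hours(files: int, workers: int) -> float:
    """Wall-clock time if each worker converts one file at a time."""
    waves = math.ceil(files / workers)
    return waves * REMOTE_MINUTES_PER_FILE / 60


def convert_file(path: str) -> str:
    """Hypothetical stand-in: the real step would run the headless pipeline."""
    return path[: -len(".java")] + ".kt"


def fan_out(paths: List[str], workers: int) -> List[str]:
    """Per-file fan-out; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(convert_file, paths))


# Assumption (not from the source): a fleet of 1,000 concurrent workers.
print(fleet_wall_clock_hours(100_000, 1_000))  # prints 50.0
print(fan_out(["Foo.java", "Bar.java"], workers=2))  # prints ['Foo.kt', 'Bar.kt']
```

Each file is slower than the IDE button, but under these assumptions the whole backlog clears in about 50 machine-hours of wall-clock time with no developer sitting in the loop.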
When to use / when not
Use when:
- File count is ≥ 10,000 in the migration scope.
- The transformation is deterministic or tractably-deterministic with custom steps (Java → Kotlin translation qualifies; arbitrary semantic refactoring does not).
- A headless mode of the underlying tool exists or can be built (see patterns/upstream-collaboration-as-migration-unblock).
- Code review infrastructure can handle a high daily diff volume.
Don't use when:
- File count is under a few hundred (IDE-button workflow remains viable).
- The transformation genuinely requires human semantic judgement per file (not just mechanical rewrite).
- There is no null-safety or type-correctness invariant that the bot-safer-than-human discipline can exploit; the pipeline's safety story depends on invariants the bot can preserve.
Contrast with AI-agent-driven migration
Meta's pattern is deterministic AST transformations + compiler-error heuristics, not LLMs or agents. Contrast with:
- Instacart Jetpack Compose — AI-skill-based migration, smaller scope.
- Cloudflare vinext — AI-driven framework rewrite, not translation.
Meta's 2024-12 framing implicitly marks the scale ceiling for rules-based tooling: at 10M lines, determinism wins because it is reviewable, debuggable, and resumable. LLMs shine at smaller, more judgement-heavy workloads; rules-based tooling shines when the transformation is mostly mechanical and the target is mostly syntactically well-defined.
Canonical side-effect: the tax on mixed-language monorepos
One load-bearing framing introduced by the Meta post is that the driver for finishing the migration isn't just the target-language benefits — it's the ongoing tax of running two toolchains in a monorepo:
"Compiling Kotlin is slower than compiling Java, but compiling both together is the slowest of all."
See also concepts/monorepo — this extends the monorepo cost framing with the mixed-language build-speed tax.
Seen in
- sources/2024-12-18-meta-translating-10m-lines-of-java-to-kotlin — canonical Meta-scale instance.
Related
- systems/kotlinator — the pipeline that operationalises the pattern.
- systems/j2k-converter — the underlying transformation engine.
- All component-primitive links listed above.