PATTERN Cited by 1 source

Daily-diff cron for automated migration

Summary

The delivery channel for a large-scale automated migration: an internal cron system produces a daily batch of migration diffs against user-defined selection criteria, auto-routes them to relevant reviewers, runs tests + other validations, and ships the diff once a human approves. An on-demand web UI triggers the same pipeline on a specific file or module. Functionally equivalent to a "PR bot", embedded in the organisation's existing diff-review infrastructure.

Why this shape

For a migration that translates many files over months or years, three failure modes appear in the delivery story:

  1. Review attention is the binding constraint. Any migration that ships more diffs than reviewers can evaluate stalls. The right throttle is at the diff-generation stage, not the review stage.
  2. Review routing is hard. Assigning reviewers by file ownership, recency of activity, or language expertise is per-team tribal knowledge. Automating this is a non-trivial engineering investment on its own.
  3. Commit storms damage review norms. A burst of 10,000 simultaneous diffs degrades the organisation's review culture and incentivises rubber-stamping. A smoothed daily cadence is healthier.

A daily-diff cron addresses all three: the batch size is knob-tunable, reviewer routing is auto-assigned per organisational policy, and diffs arrive at a steady rate.
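
A minimal sketch of how the batch-size knob and owner-based routing could fit together. This is not Meta's implementation: `SourceFile`, `select_daily_batch`, `route_reviewers`, the `owner` field and the example paths are all hypothetical names chosen for illustration, and routing by a single owner is an assumed policy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class SourceFile:
    path: str
    owner: str              # hypothetical: team that owns the file
    migrated: bool = False  # True once a migration diff has shipped


def select_daily_batch(
    files: Iterable[SourceFile],
    criteria: Callable[[SourceFile], bool],
    batch_size: int,
) -> List[SourceFile]:
    """Pick at most batch_size unmigrated files that match the criteria.

    Capping here throttles at the diff-generation stage, so reviewers
    never receive more than batch_size migration diffs per run.
    """
    candidates = [f for f in files if not f.migrated and criteria(f)]
    candidates.sort(key=lambda f: f.path)  # deterministic order across runs
    return candidates[:batch_size]


def route_reviewers(batch: Iterable[SourceFile]) -> Dict[str, List[str]]:
    """Group the batch by owner (assumed routing policy: owner reviews)."""
    routing: Dict[str, List[str]] = {}
    for f in batch:
        routing.setdefault(f.owner, []).append(f.path)
    return routing


repo = [
    SourceFile("app/feed/Feed.java", owner="feed-team"),
    SourceFile("app/feed/Story.java", owner="feed-team"),
    SourceFile("app/login/Login.java", owner="auth-team"),
    SourceFile("app/login/Token.java", owner="auth-team", migrated=True),
    SourceFile("tools/Gen.java", owner="infra-team"),
]

# User-defined selection criteria: only files under app/, two diffs per day.
todays_batch = select_daily_batch(
    repo, lambda f: f.path.startswith("app/"), batch_size=2
)
todays_routing = route_reviewers(todays_batch)
```

Raising or lowering `batch_size` is the only change needed to match the diff rate to available review attention; the selection rule and routing policy stay untouched.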

Canonical wiki instance — Meta's migration cron

Meta's Kotlinator ships through this exact mechanism. From the 2024-12-18 Meta post:

"Meta has an internal system that allows developers to set up what is essentially a cron job that produces a daily batch of diffs [...] based on user-defined selection criteria. This system also helps choose relevant reviewers, ensures that tests and other validations pass, and ships the diff once it's approved by a human. We also offer a web UI for developers to trigger a remote conversion of a specific file or module; behind the scenes, it runs the same process as the cron job."

Three operational properties worth highlighting:

  • Selection criteria are developer-owned. A developer targeting a specific module sets the selection rule; the cron system operationalises it.
  • On-demand path shares the pipeline. The web UI invokes the same transformation process — no parallel code paths, no drift between scheduled and ad-hoc runs.
  • Shipping gate is human approval. Tests + validations must pass; humans sign off; then ship.
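
The shared-pipeline and human-gate properties above can be sketched as two thin entry points over one function. Everything here is an assumption for illustration: `run_pipeline`, `cron_run`, `on_demand_run`, `ship_if_approved` and the stub converter and validator are hypothetical, and the real system's interfaces are not described in the source.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, List


class DiffState(Enum):
    FAILED_VALIDATION = "failed_validation"
    AWAITING_REVIEW = "awaiting_review"
    SHIPPED = "shipped"


@dataclass
class Diff:
    path: str
    state: DiffState


def run_pipeline(
    paths: Iterable[str],
    convert: Callable[[str], str],
    validate: Callable[[str], bool],
) -> List[Diff]:
    """The single transformation path used by every trigger."""
    diffs = []
    for path in paths:
        candidate = convert(path)
        ok = validate(candidate)  # tests + other validations
        state = DiffState.AWAITING_REVIEW if ok else DiffState.FAILED_VALIDATION
        diffs.append(Diff(path, state))
    return diffs


def cron_run(all_paths, criteria, convert, validate) -> List[Diff]:
    """Scheduled trigger: filter by selection criteria, then run the pipeline."""
    return run_pipeline([p for p in all_paths if criteria(p)], convert, validate)


def on_demand_run(path, convert, validate) -> List[Diff]:
    """Web-UI trigger: same pipeline, one explicitly chosen file."""
    return run_pipeline([path], convert, validate)


def ship_if_approved(diff: Diff, approved_by_human: bool) -> bool:
    """Ship only a validated diff that a human has approved."""
    if diff.state is DiffState.AWAITING_REVIEW and approved_by_human:
        diff.state = DiffState.SHIPPED
    return diff.state is DiffState.SHIPPED


# Stubs standing in for the real converter and validation suite.
def stub_convert(path: str) -> str:
    return path.replace(".java", ".kt")


def stub_validate(candidate: str) -> bool:
    return "Broken" not in candidate


scheduled = cron_run(
    ["a/Ok.java", "a/Broken.java", "b/Other.java"],
    lambda p: p.startswith("a/"),
    stub_convert,
    stub_validate,
)
ad_hoc = on_demand_run("a/Ok.java", stub_convert, stub_validate)
```

Because both triggers delegate to `run_pipeline`, a fix to the converter or the validation step reaches scheduled and ad-hoc runs at the same time.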

Relationship to Diff Authoring Time (DAT)

Meta's engineering blog has canonicalised diff authoring time as the productivity-measurement primitive for engineering work (engineering.fb.com DAT post). The daily-diff cron pattern deliberately zeroes out DAT for the migration work itself — developers don't author the diff, the bot does. Their time is spent reviewing, which is measured separately.

Pattern preconditions

  • Organisation has code-review infrastructure with programmable diff submission (Meta's "diffs", GitHub PRs, GitLab MRs, etc.).
  • Test + validation suite can run on candidate diffs before human review.
  • Reviewer-routing policy exists and is automatable.
  • Migration pipeline is idempotent on a per-file basis — the cron can safely re-try or re-run a file without corrupting state.
  • A selection-criteria DSL or API exists so developers can target subsets (by file, module, owner, age, etc.).
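
The per-file idempotency precondition can be illustrated with a small sketch. It assumes a Java-to-Kotlin style migration in which the source file is replaced by a file with a new extension; `migrate_file`, the in-memory `demo_repo` dict and the use of `str.upper` as a stand-in converter are hypothetical.

```python
from typing import Callable, Dict


def migrate_file(
    repo: Dict[str, str],
    java_path: str,
    convert: Callable[[str], str],
) -> bool:
    """Migrate one file; return True if work was done, False on a no-op.

    Re-running on a file that has already been migrated finds no source
    file and changes nothing, so a cron retry cannot corrupt state.
    """
    if not java_path.endswith(".java"):
        raise ValueError(f"not a Java source path: {java_path}")
    if java_path not in repo:
        return False  # already migrated (or removed): safe no-op
    kotlin_path = java_path[: -len(".java")] + ".kt"
    repo[kotlin_path] = convert(repo[java_path])
    del repo[java_path]
    return True


# In-memory stand-in for a repository: path -> file contents.
demo_repo = {"Foo.java": "class Foo {}"}

first_run = migrate_file(demo_repo, "Foo.java", str.upper)   # does the work
second_run = migrate_file(demo_repo, "Foo.java", str.upper)  # retry: no-op
```

A production pipeline would more likely key idempotency on repository state at diff-creation time than on an in-memory dict, but the contract is the same: the second run observes the first run's result and does nothing.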

Contrast with alternative delivery channels

  • One gigantic "big bang" PR. Unreviewable at monorepo scale, so impractical for a migration of this size.
  • Per-developer interactive workflow. Doesn't scale past a few thousand files.
  • Synchronous CI triggered on source-file change. Wrong semantics — migration should be opt-in, not enforced on every push.
  • Manual batch runs. Human operator becomes the bottleneck; no cadence discipline.

Of the channels compared here, the cron pattern is the only one that combines opt-in targeting, a steady cadence, automation, and a human-gated ship step.
