
PATTERN Cited by 1 source

Automated upstream cherry-pick bot

Intent

Continuously mirror merged upstream OSS pull requests into a privately-modified fork without human intervention on the common path, while gracefully handling conflicts, release-branch propagation, and completeness auditing. Replaces batch catch-up sync at release time with a steady-state flow that keeps the fork mergeable at all times.

When to use it

  • You maintain a private fork of an actively-developed OSS project with a non-trivial private diff.
  • You track multiple OSS branches — typically main plus one or more release branches — and have to propagate changes across both OSS-to-private and release-to-release axes.
  • Manual cherry-picks or batch tools like git-replay have started dominating engineering time.
  • You can accept draft PRs as the conflict-escalation surface; full auto-resolution isn't a hard requirement.

Mechanism

Three independent loops share a state store:

1. Incremental PR discovery (fast path, hourly cron)

  • Fetch recently closed PRs from the OSS repo via an API client (e.g. go-github).
  • Stop walking history on the first PR whose merge time predates the newest PR already in the state store.
  • Filter out PRs closed without being merged.
  • Insert new PRs into the state store as "pending cherry-pick."
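The discovery steps above can be sketched as a single pass over closed PRs. This is a minimal illustration, not the bot's actual code: `ClosedPR`, `discover_new_prs`, and the `newest_stored` watermark are hypothetical names, and the sketch assumes the API client hands back closed PRs newest-first.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ClosedPR:
    """Hypothetical minimal view of a closed upstream PR."""
    number: int
    merged_at: Optional[datetime]  # None means closed without being merged


def discover_new_prs(closed_prs: Iterable[ClosedPR],
                     newest_stored: datetime) -> List[ClosedPR]:
    """Collect PRs merged after the newest one already in the state store.

    Assumes closed_prs is ordered newest-first.
    """
    pending = []
    for pr in closed_prs:
        if pr.merged_at is None:
            continue  # closed without merging: nothing to cherry-pick
        if pr.merged_at <= newest_stored:
            break  # this PR and everything older is already recorded
        pending.append(pr)
    # Return oldest-first so later cherry-picks run in merge order.
    return sorted(pending, key=lambda pr: pr.merged_at)
```

Stopping at the watermark keeps each hourly run proportional to the number of new PRs rather than to the size of upstream history.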

2. Cherry-pick + PR creation (fast path)

For each pending PR, in merge order:

  • Check out the private fork on the mirror branch (e.g. upstream for OSS main).
  • git cherry-pick the upstream commit.
  • Open a PR regardless of outcome — don't fail the pipeline on conflict.
  • On clean cherry-pick: inherit the upstream PR's title and description, and assign the original author or merger (skip assignment if the OSS author is not an internal contributor). The PR then waits for normal review.
  • On conflict: open as draft with labels like do not merge + Conflict; post the git status output as a comment; tag the original PR author; let a human resolve.
  • Mark the PR as "processed" in the state store.

See patterns/draft-pr-for-conflicts for the draft-PR escalation primitive.
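The branching between the clean and conflict outcomes can be captured as a pure function that decides what kind of PR to open. The sketch below is illustrative only: `PullRequestSpec` and `build_pr_spec` are hypothetical names, and the actual git and GitHub API calls are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PullRequestSpec:
    """Hypothetical description of the PR the bot is about to open."""
    title: str
    body: str
    draft: bool
    labels: List[str] = field(default_factory=list)
    assignee: Optional[str] = None
    comment: Optional[str] = None


def build_pr_spec(title: str, body: str, author: str,
                  author_is_internal: bool, conflict: bool,
                  git_status: str = "") -> PullRequestSpec:
    """Always produce a PR spec; a conflict changes its shape, not its existence."""
    if conflict:
        # Conflict: draft PR, warning labels, and the git status output
        # posted as a comment that tags the original author.
        return PullRequestSpec(
            title=title,
            body=body,
            draft=True,
            labels=["do not merge", "Conflict"],
            comment=f"@{author} this cherry-pick conflicted:\n\n{git_status}",
        )
    # Clean pick: inherit metadata; assign only internal contributors.
    return PullRequestSpec(
        title=title,
        body=body,
        draft=False,
        assignee=author if author_is_internal else None,
    )
```

Keeping the decision separate from the side effects makes the "open a PR regardless of outcome" rule easy to check in isolation.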

3. Release-branch backport (opt-in, fast path)

Rather than automatically backporting every upstream PR to every latest-x.0 release branch:

  • Let authors apply a label like Backport to: latest-x.0 to the private PR.
  • On the next bot cycle, cherry-pick that PR to the specified release branch's mirror, opening a new PR (draft if conflict).

See patterns/label-triggered-backport — author-as-decision-maker matches the real decision boundary.
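A label-driven trigger reduces to parsing PR labels into target branches. The following is a rough sketch under stated assumptions: the exact label text follows the `Backport to: latest-x.0` example above, `backport_targets` is a hypothetical helper, and the version numbers in the usage note are made up for illustration.

```python
import re
from typing import Iterable, List

# Assumed label format, taken from the example label in the text above.
BACKPORT_LABEL = re.compile(r"^Backport to:\s*(latest-\d+\.0)$")


def backport_targets(labels: Iterable[str]) -> List[str]:
    """Return the release-branch mirrors requested via labels, in label order."""
    targets: List[str] = []
    for label in labels:
        match = BACKPORT_LABEL.match(label.strip())
        if match and match.group(1) not in targets:
            targets.append(match.group(1))  # skip duplicates
    return targets
```

For example, labels `["bug", "Backport to: latest-19.0"]` yield `["latest-19.0"]`, while a PR with no matching label yields an empty list and is left alone.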

Out-of-band: weekly reconciliation

A slower sweep audits whether the fast path actually maintained its invariants — see patterns/weekly-reconciliation-check. Without this check the fast path's silent-omission failure modes accumulate undetected.
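One plausible form of the audit is a set comparison between what upstream merged and what the state store says was processed. The source does not spell out the exact check, so treat this as an assumption; `find_missing` and the status strings are hypothetical.

```python
from typing import Dict, Iterable, List


def find_missing(upstream_merged: Iterable[int],
                 state: Dict[int, str]) -> List[int]:
    """Return upstream PR numbers that are absent from the state store
    or recorded with any status other than 'processed'."""
    return sorted(
        number for number in set(upstream_merged)
        if state.get(number) != "processed"
    )
```

Anything this returns is a candidate for a reconciliation issue that a developer then triages.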

Infrastructure

  • Compute: scheduled GitHub Actions cron jobs — no dedicated server. Cheap, observable, low-ceremony. See patterns/stateful-github-actions-cron.
  • State: an external database keyed by upstream PR ID, tracking cherry-pick status. Required to make incremental PR discovery efficient and to support reconciliation. PlanetScale use their own product here but any relational DB works.
  • Auth: GitHub token with write access to the private fork and read access to the upstream repo.
  • API client: go-github or equivalent. Respect rate limits on incremental fetches.
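A state store keyed by upstream PR ID needs very little schema. The sketch below uses SQLite from the Python standard library purely as a stand-in for whatever relational database is used; the table name, column names, and helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per upstream PR, keyed by PR number.
SCHEMA = """
CREATE TABLE IF NOT EXISTS upstream_prs (
    pr_number INTEGER PRIMARY KEY,
    merged_at TEXT NOT NULL,  -- ISO 8601 UTC, so string order is time order
    status    TEXT NOT NULL DEFAULT 'pending'
              CHECK (status IN ('pending', 'processed'))
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_pending(conn: sqlite3.Connection, pr_number: int, merged_at: str) -> None:
    # INSERT OR IGNORE keeps repeated discovery runs idempotent.
    conn.execute(
        "INSERT OR IGNORE INTO upstream_prs (pr_number, merged_at) VALUES (?, ?)",
        (pr_number, merged_at),
    )
    conn.commit()


def mark_processed(conn: sqlite3.Connection, pr_number: int) -> None:
    conn.execute(
        "UPDATE upstream_prs SET status = 'processed' WHERE pr_number = ?",
        (pr_number,),
    )
    conn.commit()


def newest_merged_at(conn: sqlite3.Connection):
    """Watermark for incremental discovery; None when the store is empty."""
    return conn.execute("SELECT MAX(merged_at) FROM upstream_prs").fetchone()[0]
```

The primary key on the PR number is what lets the discovery loop be re-run safely, and the `merged_at` watermark is what lets it stop early.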

Participants

  • OSS repo — source of truth for upstream PRs.
  • Private fork repo: the upstream branch mirrors OSS main; latest-x.0 branches mirror OSS release-x.0.
  • State database — tracks per-PR processing status.
  • GitHub Actions cron — scheduled compute.
  • Developers — resolve conflict PRs, apply backport labels, triage reconciliation issues.
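The branch mirroring convention among the participants can be written down as a small mapping. This is a sketch of the naming rule as described above; `mirror_branch` is a hypothetical helper, and any branch outside the two described shapes is rejected rather than guessed at.

```python
def mirror_branch(oss_branch: str) -> str:
    """Map an OSS branch name to its mirror branch in the private fork."""
    if oss_branch == "main":
        return "upstream"
    prefix = "release-"
    if oss_branch.startswith(prefix):
        # release-x.0 in OSS is mirrored by latest-x.0 in the fork.
        return "latest-" + oss_branch[len(prefix):]
    raise ValueError(f"no mirror branch defined for {oss_branch!r}")
```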

Variations

  • Conflict-resolution memoisation (concepts/conflict-resolution-memoization) can be layered in before the draft-PR escalation so repeated conflicts resolve without human intervention. PlanetScale's systems/git-replay predecessor had this; the replacement bot dropped it in favour of draft-PR escalation.
  • Automatic backport — blanket-backport every change to every release branch. Higher automation, lower selectivity. Works when the team has strong ownership and wants every fix everywhere by default.
  • Self-hosted scheduler instead of GitHub Actions — e.g. a Kubernetes CronJob or a dedicated bot server. More operational surface area, but avoids GitHub Actions scheduling limits for high-frequency cases.

Consequences

Benefits

  • Continuous flow rather than pre-release catch-up.
  • Private diff doesn't need explicit maintenance — OSS changes arrive continuously, private changes land normally against the private upstream.
  • New release branches cost nothing — cut a private latest-x.0 and the bot handles it.
  • Clear accountability for conflicts — original PR author (not a central maintainer) is tagged on their own conflicts.
  • Observable — every PR in the private repo is a real reviewable artefact; history is auditable.

Costs / pitfalls

  • Bootstrapping is non-trivial — seeding the state store with already-processed history needs care; skipping this step either duplicates or omits PRs.
  • Conflict-backlog risk — if the private diff is large enough that most cherry-picks conflict, the draft-PR queue becomes its own bottleneck. Measure conflict rate before trusting steady-state.
  • Label-discipline dependency — backport completeness requires authors to apply labels. Missing labels are silent until the weekly reconciliation catches them.
  • State DB is load-bearing — DB outage stalls the bot.
  • GitHub Actions scheduling is best-effort — cron jobs can be delayed under GitHub load. For use cases with strict timing requirements, consider a self-hosted scheduler.
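Measuring the conflict rate, as suggested under the conflict-backlog pitfall, takes only a count over recent cherry-pick outcomes. A minimal sketch, assuming outcomes are recorded as the hypothetical strings "clean" and "conflict":

```python
from typing import Iterable


def conflict_rate(outcomes: Iterable[str]) -> float:
    """Fraction of cherry-picks that conflicted; 0.0 when there is no data."""
    results = list(outcomes)
    if not results:
        return 0.0
    conflicts = sum(1 for outcome in results if outcome == "conflict")
    return conflicts / len(results)
```

What rate counts as too high is a judgment call for each team; the source gives no threshold.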

Canonical instantiation

systems/vitess-cherry-pick-bot: PlanetScale's bot that continuously mirrors OSS Vitess PRs into their private fork, with label-triggered backports to release branches and weekly reconciliation. The retrospective reports "a year and six months" of production use. (Source: sources/2026-04-21-planetscale-automating-cherry-picks-between-oss-and-private-forks)
