PATTERN Cited by 1 source
Weekly reconciliation check¶
Intent¶
Run a lower-frequency, full-sweep audit that verifies the invariants the primary (fast-path) automation is supposed to maintain, and surface any violations for human triage. The fast path optimises for latency and forward progress; the weekly sweep optimises for completeness and catches silent omissions the fast path can't see.
When to use it¶
- You run a fast-path automation (e.g. an automated cherry-pick bot) that has invariants like "every merged upstream PR has a matching private PR".
- The fast path depends on signals it doesn't fully control (labels, events, external state), so it can silently miss items.
- You can tolerate a detection delay of roughly one cron interval; hard real-time reconciliation would be a different pattern.
- You have an appropriate human triage surface — a dedicated issue, a dashboard, a pager, or a Slack channel.
Mechanism¶
Define the invariants¶
Reconciliation only makes sense if the invariants are crisp. For a fork-sync cherry-pick bot the invariants from the canonical source are:
Upstream-in-sync-with-OSS:
- Every OSS main PR has a corresponding cherry-pick PR in private upstream that is either open or merged.
- No PRs are merged directly into latest-x.0 branches; they must come via backport from upstream.
- No private cherry-pick PR against upstream is left open indefinitely.
Latest-branches-consistent:
- Every PR merged to latest-x.0 was backported from upstream (not direct-merged).
- No backport PR against a latest branch is left open indefinitely.
- If a PR is backported to latest-x.0, it is also backported to higher-numbered latest branches (where applicable).
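To show what a "crisp" invariant looks like in practice, the backport fan-out rule above can be written as a small predicate over data. This is a minimal sketch, not the bot's actual code: the `Backport` record, the `missing_fanout` helper, and concrete branch names such as `latest-21.0` are assumptions made for illustration, and the "where applicable" exceptions are deliberately ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backport:
    """Hypothetical record: one backport PR of an upstream PR onto a latest branch."""
    source_pr: int  # number of the upstream PR being backported
    branch: str     # target branch, assumed to look like "latest-21.0"


def branch_major(branch: str) -> int:
    """Extract the major version from an assumed 'latest-<major>.0' name."""
    return int(branch.removeprefix("latest-").split(".")[0])


def missing_fanout(
    backports: list[Backport], latest_branches: list[str]
) -> dict[int, list[str]]:
    """Map each upstream PR to the higher-numbered latest branches it is missing from."""
    branches_by_pr: dict[int, set[str]] = {}
    for bp in backports:
        branches_by_pr.setdefault(bp.source_pr, set()).add(bp.branch)

    violations: dict[int, list[str]] = {}
    for pr, branches in branches_by_pr.items():
        lowest = min(branch_major(b) for b in branches)
        # Every latest branch above the lowest backport target is expected.
        expected = {b for b in latest_branches if branch_major(b) > lowest}
        missing = sorted(expected - branches, key=branch_major)
        if missing:
            violations[pr] = missing
    return violations
```

A PR backported to `latest-20.0` and `latest-22.0` but not `latest-21.0` would be reported with `latest-21.0` as the missing branch; a PR backported only to the highest branch produces no violation.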
Schedule¶
Run the sweep on its own schedule, separate from the fast path. A weekly cron is typical: infrequent enough that a full sweep stays cheap, and rare enough that each report gets human attention.
Data sources¶
- The bot's own state store (what it believes it has processed).
- The actual state of the repos (what's merged, what's open, what labels are applied).
The reconciliation is a join between these two views, looking for disagreements.
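The join can be sketched with plain set operations. The shapes used here are assumptions for illustration only: the bot's state is reduced to a set of OSS PR numbers it believes it has processed, and the repo state to a set of merged OSS PR numbers plus a mapping from OSS PR number to the state of its cherry-pick PR. The names `Drift` and `reconcile` are hypothetical.

```python
from typing import NamedTuple


class Drift(NamedTuple):
    """Disagreements between the bot's view and the repos' actual state."""
    unseen_by_bot: list[int]        # merged in OSS, absent from the bot's state
    missing_cherry_pick: list[int]  # bot claims processed, but no live cherry-pick PR


def reconcile(
    bot_processed: set[int],
    oss_merged: set[int],
    cherry_pick_state: dict[int, str],
) -> Drift:
    """Join the bot's state store against repo state and report mismatches."""
    # Items the fast path never saw at all (for example, a missed event).
    unseen = sorted(oss_merged - bot_processed)
    # Items the bot believes it handled, but whose cherry-pick PR is neither
    # open nor merged (closed without merging, or never created).
    missing = sorted(
        pr
        for pr in oss_merged & bot_processed
        if cherry_pick_state.get(pr) not in {"open", "merged"}
    )
    return Drift(unseen, missing)
```

For example, with `bot_processed={1, 2, 3}`, `oss_merged={1, 2, 3, 4}` and `cherry_pick_state={1: "merged", 2: "closed"}`, PR 4 is unseen by the bot, and PRs 2 and 3 lack a live cherry-pick PR.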
Output surface¶
A dedicated GitHub issue per check type is the canonical output: "The bot posts a summary of these checks to a dedicated GitHub issue every week, providing visibility into any issues that may require manual inspection or action." (Source: sources/2026-04-21-planetscale-automating-cherry-picks-between-oss-and-private-forks)
The issue accumulates history across weeks (as comments) and serves as a public, auditable record of drift incidents and resolutions.
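One way to produce the weekly comment is to render the violations as a short Markdown checklist. The sketch below only builds the comment body; how it is posted depends on whatever GitHub client the bot already uses and is not shown. The function name `render_report` and the report layout are assumptions, not the format the source's bot uses.

```python
from datetime import date


def render_report(check_name: str, week_of: date, violations: list[str]) -> str:
    """Build a Markdown comment body summarising one check for one week."""
    lines = [f"## {check_name}: week of {week_of.isoformat()}", ""]
    if not violations:
        # An explicit "all clear" keeps the history continuous across weeks.
        lines.append("No invariant violations found.")
    else:
        lines.append(f"{len(violations)} violation(s) need triage:")
        lines.append("")
        # Task-list items let triagers tick off each finding as it is resolved.
        lines.extend(f"- [ ] {v}" for v in violations)
    return "\n".join(lines)
```

Posting an explicit "no violations" comment on quiet weeks is a design choice: it distinguishes "the check ran and found nothing" from "the check did not run".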
Why weekly?¶
- Full-sweep cost: scanning all PRs in scope and joining against state is more expensive than the fast path's incremental run. Running it once a week keeps the total cost low.
- Noise: invariant violations should be rare. Reporting more often than they happen just trains humans to ignore the issue.
- Review cadence alignment: a weekly cadence aligns with most team rhythms — Monday-morning triage catches everything from the previous week.
- Regression detection lag: if the fast path breaks on Tuesday, weekly reconciliation catches it by the following Monday. That delay would be unacceptable for an urgent production system, but it is fine for a bot whose breakage would otherwise go unnoticed until a release ships wrong.
Contrast with continuous reconciliation¶
Continuous reconciliation (reacting to every event and comparing against state) is a different pattern — appropriate when drift must be detected within minutes (e.g. patterns/consistency-checkers in Dropbox Magic Pocket). Weekly is right when the fast path is "correct in aggregate" and the reconciliation is catching tail misses, not keeping up with primary load.
Consequences¶
Benefits¶
- Catches what the fast path can't see — missing labels, bypass merges, stalled PRs, incomplete backport fan-out.
- Auditable history — every week's report is a public record.
- Cheap to operate — one expensive query per week is rarely a concern.
- Clarifies ownership — the issue's assignees are the accountable party.
- Improves the fast path over time — repeated findings of the same kind become input to fast-path improvements.
Costs / pitfalls¶
- Detection delay — up to one cron interval before a violation is flagged.
- Requires crisp invariants — vague invariants produce useless audits.
- Noise tuning — a reconciliation that fires false positives (e.g. about PRs that are legitimately awaiting review) trains the team to ignore it. Tune the query.
- Humans have to actually read it — a report nobody reads is worse than no report; at least no report is honest.
- Integration with alerting — by default the report just sits in an issue. If the invariants are severity-graded, some should page rather than comment.
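Two of the pitfalls above, noise tuning and severity grading, lend themselves to a short sketch. Everything specific here is an assumption for illustration: the 14-day grace period, the violation kind names, and the mapping of kinds to severities are hypothetical and would need to be chosen per team.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class Severity(Enum):
    COMMENT = "comment"  # goes into the weekly issue only
    PAGE = "page"        # also alerts a human directly


# Hypothetical grading: bypass merges are urgent, stale PRs are not.
SEVERITY_BY_KIND = {
    "direct_merge_to_latest": Severity.PAGE,
    "stale_open_pr": Severity.COMMENT,
}


def stale_open_prs(
    opened_at: dict[int, datetime],
    now: datetime,
    grace: timedelta = timedelta(days=14),
) -> list[int]:
    """Return open PRs older than the grace period.

    PRs still inside the grace period are treated as legitimately awaiting
    review and are not reported, which keeps false positives down.
    """
    return sorted(pr for pr, opened in opened_at.items() if now - opened > grace)


def route(kind: str) -> Severity:
    """Pick an output channel for a violation kind; unknown kinds only comment."""
    return SEVERITY_BY_KIND.get(kind, Severity.COMMENT)
```

With the assumed 14-day grace period, a PR opened 30 days ago is flagged while one opened 3 days ago is not. Defaulting unknown kinds to a comment is the conservative choice for alert fatigue; a team that prefers to fail loudly could default to paging instead.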
Variations¶
- Daily instead of weekly — for higher-stakes fast paths.
- Split into multiple checks with different cadences — cheap ones daily, expensive ones weekly.
- Issue-per-week vs comment-on-ongoing-issue — tradeoff between long history and alert fatigue.
- Slack / email digest in addition to the issue — for discoverability.
- Auto-remediation for common cases: for example, if a cherry-pick PR is stale, the bot can comment-ping the assignee; if a label is missing for a known bug-fix pattern, the bot can apply it.
Canonical instantiation¶
systems/vitess-cherry-pick-bot runs two weekly reconciliation checks (upstream-in-sync-with-OSS and latest-branches-consistent) covering the invariants above, posted to a dedicated GitHub issue. "With these safety measures in place, we cautiously rolled out the new process and began using the Vitess cherry-pick bot. Over a year and six months later, the results have been remarkable."
Related¶
- patterns/automated-upstream-cherry-pick-bot
- patterns/draft-pr-for-conflicts
- patterns/label-triggered-backport
- patterns/stateful-github-actions-cron
- patterns/consistency-checkers
- patterns/repo-health-monitoring
- systems/vitess-cherry-pick-bot
- concepts/weekly-integrity-reconciliation
- concepts/anti-entropy