PATTERN

Incremental AI re-review

Intent

When an AI code-review agent runs again on a merge request that already has prior findings, don't treat the new run as a blank slate. Feed the agent its own past review comment + the inline DiffNote thread state + user replies, and apply explicit state-transition rules so that:

  • Fixed findings auto-resolve their threads.
  • Unfixed findings re-emit to stay alive.
  • User-resolved findings are respected.
  • User replies (won't fix, acknowledged, I disagree) are read and honoured.

The re-run is a continuation, not a repetition.

Why it matters

In any iteration loop with an AI reviewer, the re-run case dominates: Cloudflare reports an average of 2.7 reviews per MR. A system that degrades on re-run degrades on the majority of its workload.

Three degradation modes the pattern prevents — see concepts/ai-rereview-incremental for the detailed list:

  1. Re-flagging already-fixed issues (developers stop trusting the tool).
  2. Breaking comment-thread state (every run opens new DiffNotes; old discussions lost).
  3. Ignoring human override (won't fix gets re-litigated on every push).

Mechanism

State the coordinator receives at re-run

  • Full text of last review comment — not a hash; the actual prose verdict + enumerated findings.
  • Prior inline DiffNotes + per-thread resolution status (open / resolved / closed).
  • User replies on each DiffNote, threaded — the coordinator sees what the human said, not just that they said something.
  • New diff vs. the previously-reviewed baseline.
  • Prior break-glass overrides (if any) — so the coordinator doesn't re-run on a break-glassed MR in the first place.
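As a sketch, the re-run inputs above could be packaged into a single context object. Field and type names here are illustrative, not Cloudflare's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class DiffNote:
    """One inline review thread, with its resolution state and replies."""
    thread_id: str
    body: str
    status: str                        # "open" | "resolved" | "closed"
    user_replies: list[str] = field(default_factory=list)

@dataclass
class RerunContext:
    """Everything the coordinator receives when it runs again on an MR."""
    last_review_comment: str           # full prose verdict + enumerated findings
    prior_notes: list[DiffNote]        # inline threads with per-thread status
    new_diff: str                      # diff vs. the previously-reviewed baseline
    break_glass: bool = False          # prior override, if any

def should_rerun(ctx: RerunContext) -> bool:
    # A break-glassed MR is never re-reviewed in the first place.
    return not ctx.break_glass
```

The `break_glass` check runs before any prompt assembly, so an overridden MR costs nothing on subsequent pushes.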

State-transition rules

  • Fixed: omit from output → MCP server auto-resolves the thread.
  • Unfixed (same severity): re-emit identically → MCP server keeps the thread alive.
  • Unfixed (worsened): re-emit with updated severity / scope.
  • User resolved: respect, unless the issue has materially worsened.
  • User replied "won't fix" / "acknowledged": treat as resolved.
  • User replied "I disagree": read the justification; either concede (resolve) or argue back with supporting evidence.

The "argue back" path is explicit: "If they reply 'I disagree', the coordinator will read their justification and either resolve the thread or argue back."
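The rules above can be sketched as a pure transition function. The state names and return values are illustrative, and in practice a judgment like "materially worsened" is LLM-evaluated rather than a boolean flag:

```python
def transition(prior_state: str, worsened: bool = False) -> str:
    """Map a prior finding's state to an action on re-run.

    A sketch of the state-transition rules; a real judge pass would
    consume rich finding objects, not plain strings.
    """
    if prior_state == "fixed":
        return "omit"                  # MCP server auto-resolves the thread
    if prior_state == "unfixed":
        return "re-emit-updated" if worsened else "re-emit"
    if prior_state == "user_resolved":
        # Respected, unless the issue has materially worsened.
        return "re-emit-updated" if worsened else "respect"
    if prior_state in ("wont_fix", "acknowledged"):
        return "treat_as_resolved"
    if prior_state == "disagree":
        return "judge"                 # concede (resolve) or argue back
    return "new"                       # no prior state: genuinely new finding
```

For example, `transition("user_resolved", worsened=True)` overrides the human resolution, while the plain `"user_resolved"` case stays respected.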

MCP comment server as state keeper

The thread state lives in the VCS (GitLab DiffNotes). The MCP comment server plugin is the coordinator's interface to it:

  • Auto-resolve threads whose findings are omitted from the new output.
  • Keep alive threads whose findings are re-emitted.
  • Open new threads for genuinely new findings.
  • Read existing threads + replies back into the coordinator's prompt assembly on the next re-run.
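The first three responsibilities amount to a reconciliation loop between prior threads and the new output. In this sketch, `mcp` is a hypothetical client whose `resolve_thread` / `open_thread` methods stand in for the real MCP comment server's API:

```python
def reconcile(prior_threads: list[dict], new_findings: list[dict], mcp) -> None:
    """Sync DiffNote threads in the VCS with the coordinator's new output.

    prior_threads: [{"thread_id": ..., "finding_id": ..., "status": ...}]
    new_findings:  [{"id": ...}]  (findings re-emitted or newly raised)
    """
    new_ids = {f["id"] for f in new_findings}
    prior_ids = set()
    for t in prior_threads:
        prior_ids.add(t["finding_id"])
        if t["finding_id"] not in new_ids and t["status"] == "open":
            # Finding omitted from the new output => treated as fixed.
            mcp.resolve_thread(t["thread_id"])
        # Re-emitted findings: the existing thread is left alive untouched.
    for f in new_findings:
        if f["id"] not in prior_ids:
            mcp.open_thread(f)         # genuinely new finding => new thread
```

Note that "keep alive" is deliberately a no-op: the thread already exists in GitLab, so re-emitting a finding costs nothing on the VCS side.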

Shape

commit N+1 pushed
orchestrator detects re-run:
  ├── fetch last review comment (MCP)
  ├── fetch prior DiffNotes + status (MCP)
  ├── fetch user replies (MCP)
  └── compute new diff vs. reviewed baseline
coordinator prompt assembly:
  ├── previous_review section
  ├── existing_inline_findings section
  ├── user_replies section (per thread)
  └── changed_files section
     ▼  run coordinator + sub-reviewers
judge pass applies state-transition rules
MCP comment server:
  ├── auto-resolve threads for omitted findings
  ├── keep threads alive for re-emitted findings
  ├── open new threads for new findings
  └── post revised top-level review comment

Bias-toward-approval on re-runs

Cloudflare's coordinator applies the same approval rubric on re-run as on first run:

  • If the new diff addresses all prior criticals, the verdict transitions significant_concerns → approved_with_comments.
  • The MCP auto-approval call (POST /approve) is idempotent; re-running it is safe.

Without this discipline, a merge would stay blocked until a human manually flipped the bot's prior unapprove. With it, the bot itself resolves its prior block as soon as the diff fixes the issues.
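A minimal sketch of the re-run verdict logic, assuming `approve` is a callable wrapping the idempotent MCP POST /approve call (so repeating it on every re-run is harmless):

```python
def rerun_verdict(criticals_remaining: int, approve) -> str:
    """Apply the same approval rubric on re-run as on first run.

    `approve` wraps POST /approve; because the call is idempotent,
    it is issued unconditionally whenever the rubric passes, even
    if the MR was already approved on an earlier run.
    """
    if criticals_remaining == 0:
        approve()                      # bot resolves its own prior block
        return "approved_with_comments"
    return "significant_concerns"
```

Calling `rerun_verdict` twice with zero remaining criticals fires `approve` twice, which is exactly the safe-to-repeat behavior the pattern relies on.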

Tradeoffs

  • Coordinator context cost goes up on re-run — it's now consuming its own prior output as input. For MRs with many prior comments, this can hit context-window limits; hence the "coordinator prompt >50% context window → warn" guardrail.
  • State-transition rules are judgment calls. "Materially worsened" is LLM-evaluated, not pattern-matched. Can be wrong; the break-glass escape hatch is the safety net.
  • Argue-back can feel adversarial. A coordinator that keeps re-posting "I still think this is a bug" after a human says "I disagree" can burn trust faster than not having the feature.
  • Thread-state drift. If someone manually resolves a thread in GitLab outside the bot's knowledge, the bot's re-run may re-open it. Requires careful MCP sync.
