PATTERN
Coordinator / sub-reviewer orchestration¶
Intent¶
Decompose an AI-driven review / critique / analysis task into:
- A coordinator agent running on the highest-capability model tier.
- N parallel sub-reviewer agents, each scoped to a single domain / axis / rubric, running on right-sized cheaper models.
- A final judge pass inside the coordinator — dedup, re-categorise, drop false positives, read source to verify, emit structured output.
The coordinator is the only agent that sees the aggregate; sub-reviewers never see each other's output.
When to reach for it¶
- High-stakes analysis over rich input where a single-prompt review is too shallow and a dozen overlapping prompts too noisy.
- Separable domains — security vs. performance vs. code quality; legal vs. copy vs. SEO; correctness vs. completeness vs. policy.
- Cost-per-reasoning-unit differences across model tiers — the coordinator's judgement call is worth the top tier; fan-out rarely is.
- Structured-output consumers downstream (a CI gate, a ticket system, a dashboard) — the coordinator writes the schema-conforming final artifact.
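A downstream consumer such as a CI gate needs a stable contract from the coordinator. As an illustrative sketch only (the field and type names below are hypothetical, not taken from the source), the schema-conforming final artifact might look like:

```typescript
// Hypothetical schema for the coordinator's final artifact. The severity
// levels and the three actions mirror the ones named in this pattern;
// everything else is illustrative.
type Severity = "critical" | "warning" | "suggestion";

interface Finding {
  domain: string;        // e.g. "security", "performance"
  severity: Severity;
  file: string;
  line?: number;
  message: string;
}

interface ReviewArtifact {
  action: "approve" | "approved_with_comments" | "block";
  findings: Finding[];
}

// A CI gate keys its decision off the action field alone, never free text.
function gate(artifact: ReviewArtifact): number {
  return artifact.action === "block" ? 1 : 0; // non-zero exit fails the pipeline
}
```

Keying the gate off a closed enum (rather than parsing prose) is what makes the coordinator's output safe to consume mechanically.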
Shape¶
input (MR, doc, artifact, ...)
│
▼
┌───────────────────┐
│ Coordinator │ ← top-tier model (Opus 4.7 / GPT-5.4)
│ (reads full ctx) │
└─────────┬─────────┘
│ spawn_reviewers(domains, shared_context_path)
│
┌─────┬───────┼───────┬─────┬──────┐
▼ ▼ ▼ ▼ ▼ ▼
[sec] [perf] [quality] [docs] [rel] [codex]
│ │ │ │ │ │
│ │ │ │ │ │ ← each runs on a cheaper tier
│ │ │ │ │ │ and reads shared-mr-context.txt
│ │ │ │ │ │ + per-file patches
▼ ▼ ▼ ▼ ▼ ▼
structured XML findings (critical/warning/suggestion)
│
▼
┌───────────────────┐
│ Judge pass │ ← dedup / re-categorise / drop FPs /
│ (coordinator) │ verify by reading source
└─────────┬─────────┘
│
▼
final structured output + action
(approve / approved_with_comments / block)
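The shape above can be sketched in miniature: a parallel fan-out followed by a single consolidation step. Everything here is illustrative; `runSubReviewer`, `judgePass`, and the domain list are hypothetical stand-ins for the runtime-plugin tool calls the real system makes.

```typescript
type Severity = "critical" | "warning" | "suggestion";
interface Finding { domain: string; severity: Severity; message: string; }

// Hypothetical stand-in for one right-sized, single-domain sub-reviewer call.
async function runSubReviewer(domain: string, contextPath: string): Promise<Finding[]> {
  return [{ domain, severity: "suggestion", message: `stub finding for ${domain}` }];
}

// Hypothetical judge pass: keep each (domain, message) pair once.
function judgePass(all: Finding[]): Finding[] {
  const seen = new Set<string>();
  return all.filter(f => {
    const key = `${f.domain}:${f.message}`;
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

async function review(contextPath: string): Promise<Finding[]> {
  const domains = ["security", "performance", "quality", "docs"];
  // Sub-reviewers run independently; none sees another's output.
  const perDomain = await Promise.all(domains.map(d => runSubReviewer(d, contextPath)));
  // Only the coordinator ever sees the aggregate.
  return judgePass(perDomain.flat());
}
```

The isolation property is structural: each `runSubReviewer` call gets only its domain and the shared-context path, and the aggregate exists only inside `review`.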
Mechanism¶
- Coordinator spawned as a child process. Prompt via stdin to avoid ARG_MAX; JSONL on stdout so the orchestrator can process events in real time. See patterns/jsonl-streaming-child-process.
- spawn_reviewers as a tool. The coordinator calls a runtime-plugin tool to launch sub-agents; it does not see their tools, prompts, or individual outputs — it only sees their final structured findings.
- Specialised prompts per sub-reviewer. Each sub-agent has a tightly scoped prompt with an explicit "What NOT to flag" section — see concepts/what-not-to-flag-prompt.
- Shared-context file on disk. Sub-reviewers read shared-mr-context.txt (and per-file patches) via a file tool — never embedded in the prompt — to exploit provider prompt caching.
- Structured severity output. Findings carry a severity enum (critical/warning/suggestion); downstream action is keyed off severity, not free text.
- Judge pass. The coordinator deduplicates (the same issue from two reviewers is kept once, in the best-fit section), re-categorises (a perf bug flagged by code quality moves to the perf section), drops speculative items, and may call its tools to read the source to verify an ambiguous finding.
- Timeouts at three levels. Per-task (5 min typical; 10 for heavier reviewers), overall (e.g. 25 min), retry-budget minimum (e.g. 2 min).
- Resilience. Circuit breaker + failback chains per model tier isolate sub-reviewer provider failures from the coordinator's judgment logic.
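The child-process transport from the first bullet can be sketched as follows, assuming a Node-style runtime. The `agent` binary name and the event shapes are hypothetical; only the stdin-prompt and JSONL-on-stdout mechanics come from the pattern.

```typescript
import { spawn } from "node:child_process";
import { createInterface } from "node:readline";

// Parse a single JSONL line, degrading to null for malformed input.
function parseJsonlLine(line: string): unknown | null {
  try { return JSON.parse(line); } catch { return null; }
}

// Launch the coordinator as a child process; "agent" is a hypothetical binary.
function runCoordinator(prompt: string, onEvent: (e: unknown) => void): Promise<number> {
  const child = spawn("agent", ["--coordinator"]);

  // Prompt via stdin, not argv, so large MR contexts never hit ARG_MAX.
  child.stdin.write(prompt);
  child.stdin.end();

  // JSONL on stdout: one event per line, handled as it streams.
  const lines = createInterface({ input: child.stdout });
  lines.on("line", line => {
    const event = parseJsonlLine(line);
    if (event !== null) onEvent(event);
  });

  return new Promise(resolve => child.on("close", code => resolve(code ?? 1)));
}
```

Streaming line-by-line is what lets the orchestrator surface progress (and enforce its timeouts) in real time instead of waiting for the full review to finish.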
Why it works better than a single prompt¶
- Small tool inventory per agent → reliable tool selection (patterns/specialized-agent-decomposition).
- Narrow prompt per agent → better "what NOT to flag" discipline; less drift between domains.
- Independent parallelism → critical-path latency is set by the slowest sub-reviewer, not the sum of all N.
- Deduplication + re-categorisation happens centrally → downstream consumer sees one clean output, not N overlapping ones.
- Coordinator judgment cost is amortised against N parallel sub-reviewer calls → top-tier model used only where it matters.
Tradeoffs¶
- Orchestration complexity. Plugin architecture, IPC, timeouts, failback — harder than a single prompt.
- Cost scales with diff size. Seven frontier-model calls on a 500-file refactor is real money. Pair with patterns/ai-review-risk-tiering to gate expensive tiers.
- Coordinator is the bottleneck. Its context must fit summarised sub-reviewer output + full MR metadata; Cloudflare warns when coordinator prompt exceeds 50% of estimated context window.
- Sub-reviewer output must be structured. If any sub-agent returns malformed output, the coordinator's judge pass has to handle it — see concepts/structured-output-reliability.
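One defensive approach to the last point, sketched here with JSON payloads rather than the XML findings the source describes: validate every finding, drop malformed entries, and degrade an entirely unparseable payload to an empty list so the judge pass keeps running. All names are illustrative.

```typescript
type Severity = "critical" | "warning" | "suggestion";
interface Finding { domain: string; severity: Severity; message: string; }

const SEVERITIES: ReadonlySet<string> = new Set(["critical", "warning", "suggestion"]);

// Accept only well-formed findings. Malformed entries are dropped one by one,
// and a payload that fails to parse at all yields an empty list instead of
// crashing the coordinator's judge pass.
function parseFindings(raw: string, domain: string): Finding[] {
  let data: unknown;
  try { data = JSON.parse(raw); } catch { return []; }
  if (!Array.isArray(data)) return [];
  return data.flatMap(item => {
    if (typeof item !== "object" || item === null) return [];
    const { severity, message } = item as Record<string, unknown>;
    if (typeof message !== "string" || typeof severity !== "string" ||
        !SEVERITIES.has(severity)) return [];
    return [{ domain, severity: severity as Severity, message }];
  });
}
```

The closed severity set matters: an out-of-vocabulary severity from a drifting sub-agent is rejected here rather than leaking into downstream action keying.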
Contrasts with related patterns¶
- vs. patterns/planner-coder-verifier-router-loop — Google DS-STAR's role-based decomposition sits inside a refinement loop (plan → code → verify → route-add-or-fix → repeat). Coordinator / sub-reviewer is single-pass with fan-out; the "judge pass" is not a refinement round, it's a consolidation.
- vs. patterns/specialized-agent-decomposition — sub-reviewer is one realisation of the domain-based decomposition variant, with the coordinator explicitly staged as the consolidator.
- vs. concepts/llm-as-judge — sub-reviewer output consolidation uses judge-pattern discipline inside the coordinator; not every LLM-as-judge application is multi-agent.
Seen in¶
- sources/2026-04-20-cloudflare-orchestrating-ai-code-review-at-scale — canonical production instance. 7 specialised sub-reviewers (security, performance, code quality, documentation, release, AGENTS.md, engineering-codex) with coordinator on Opus 4.7 / GPT-5.4, sub-reviewers on Sonnet 4.6 / GPT-5.3 / Kimi K2.5. 131,246 review runs across 5,169 repos in first 30 days; median review 3m 39s; 1.2 findings per review; 85.7% prompt-cache hit rate driven partly by the shared-context file.
Related¶
- patterns/specialized-reviewer-agents — the sub-agent structure side of this pattern.
- patterns/specialized-agent-decomposition — the broader decomposition pattern family.
- patterns/jsonl-streaming-child-process — the embedding transport.
- patterns/incremental-ai-rereview — how coordinator state is re-read on re-runs.
- patterns/planner-coder-verifier-router-loop — sibling multi-agent pattern organised by loop role.
- concepts/llm-as-judge — the judge-pass discipline inside the coordinator.
- concepts/shared-context-fan-out — the file-on-disk optimisation that makes N-sub-agent costs tolerable.
- systems/cloudflare-ai-code-review — canonical instance.
- systems/opencode — the coding-agent substrate.