CONCEPT Cited by 1 source
Shared-context fan-out¶
Shared-context fan-out is the pattern of writing large common context (merge-request metadata, previous review findings, diff summaries) to a shared file on disk and pointing multiple concurrent sub-agents at it via their file-read tool, rather than embedding the context in every sub-agent's prompt. The shared file is read once per agent via a tool call; the prompt stays small.
The token-cost problem it solves¶
A naïve N-sub-agent fan-out embeds the full MR context in each sub-agent's prompt:
agent 1 prompt: [base] + [agent 1 specialism] + [full MR context]
agent 2 prompt: [base] + [agent 2 specialism] + [full MR context]
...
agent N prompt: [base] + [agent N specialism] + [full MR context]
If each sub-agent runs independently, the same MR context is billed N times. Cloudflare's framing: "Duplicating even a moderately-sized MR context across seven concurrent reviewers would multiply our token costs by 7x."
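The multiplication can be made concrete with a small calculation. This is a minimal sketch: the function names and every token count below are hypothetical, chosen only for illustration, and are not taken from Cloudflare's system.

```python
def naive_fanout_tokens(n_agents: int, base: int, specialism: int, context: int) -> int:
    """Prompt tokens when every sub-agent prompt embeds the full context."""
    return n_agents * (base + specialism + context)


def path_reference_prompt_tokens(n_agents: int, base: int, specialism: int, path_ref: int) -> int:
    """Prompt tokens when each sub-agent prompt carries only a short path reference."""
    return n_agents * (base + specialism + path_ref)


# Hypothetical sizes (assumptions, not measured values).
N, BASE, SPECIALISM, CONTEXT, PATH_REF = 7, 2_000, 500, 20_000, 30

naive = naive_fanout_tokens(N, BASE, SPECIALISM, CONTEXT)
small = path_reference_prompt_tokens(N, BASE, SPECIALISM, PATH_REF)

# Tokens spent purely on repeating the same context in each prompt.
duplicated_context = N * CONTEXT

print(naive, small, duplicated_context)
```

The sketch counts only the initial prompt. Whatever the file-read tool later returns is billed according to the provider's own rules and is not modelled here.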
The fan-out shape¶
- Orchestrator assembles the shared context once; in Cloudflare's case, a file literally named `shared-mr-context.txt`.
- Orchestrator also writes per-file patch files to a `diff_directory` so each sub-agent can read only the patches relevant to its specialism.
- Each sub-agent prompt references the path ("read `shared-mr-context.txt` and the patches for files `X, Y, Z` from `diff_directory/`") rather than embedding content.
- Sub-agents use their built-in file-read tool on the shared paths.
- Provider-side prompt caching then kicks in: the base agent prompts are identical across all runs and the `shared-mr-context.txt` path is a cacheable input, so cross-request cache hits are dramatic.
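The orchestrator side of these steps might look like the following. This is a minimal sketch under stated assumptions: only the names `shared-mr-context.txt` and `diff_directory` come from the source; the helper functions, the `.patch` suffix, the prompt wording, and the sample data are hypothetical.

```python
import tempfile
from pathlib import Path


def write_shared_context(workdir: Path, mr_context: str, patches: dict[str, str]) -> tuple[Path, Path]:
    """Write the shared context once, plus one patch file per changed file."""
    context_path = workdir / "shared-mr-context.txt"
    context_path.write_text(mr_context, encoding="utf-8")

    diff_dir = workdir / "diff_directory"
    for filename, patch in patches.items():
        # Mirror the repository layout under diff_directory (hypothetical choice).
        patch_path = diff_dir / (filename + ".patch")
        patch_path.parent.mkdir(parents=True, exist_ok=True)
        patch_path.write_text(patch, encoding="utf-8")
    return context_path, diff_dir


def build_prompt(base: str, specialism: str, context_path: Path, diff_dir: Path, files: list[str]) -> str:
    """Reference the shared paths instead of embedding their content."""
    file_list = ", ".join(files)
    return (
        f"{base}\n{specialism}\n"
        f"Read {context_path.name} and the patches for files {file_list} from {diff_dir.name}/."
    )


BASE_PROMPT = "You are a code reviewer."  # identical for every sub-agent

with tempfile.TemporaryDirectory() as tmp:
    context_path, diff_dir = write_shared_context(
        Path(tmp),
        mr_context="MR title, description, and previous review findings go here.",
        patches={
            "src/auth.py": "--- a/src/auth.py\n+++ b/src/auth.py\n",
            "README.md": "--- a/README.md\n+++ b/README.md\n",
        },
    )
    context_existed = context_path.is_file()
    written = sorted(p.relative_to(diff_dir).as_posix() for p in diff_dir.rglob("*.patch"))
    prompts = {
        "security": build_prompt(BASE_PROMPT, "Focus on security.", context_path, diff_dir, ["src/auth.py"]),
        "docs": build_prompt(BASE_PROMPT, "Focus on documentation.", context_path, diff_dir, ["README.md"]),
    }
```

Each prompt differs only in its specialism line and file list. The context and the patches never appear in the prompt text itself.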
Observed effect¶
Cloudflare's production numbers validate the design: 85.7% prompt-cache hit rate across ~120 B tokens processed in the first 30 days. Cache reads dominate cache writes by roughly 10:1. The savings are quantified: "this saves us an estimated five figures compared to what we would pay at full input token pricing."
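To see why a high hit rate translates into savings, consider a blended-cost calculation. This is a sketch only: the unit price, the discount applied to cached reads, and the token volume are hypothetical placeholders rather than Cloudflare's or any provider's actual figures. It also ignores any surcharge for cache writes and assumes the hit rate is measured over tokens.

```python
def blended_input_cost(
    total_tokens: int,
    hit_rate: float,
    price_per_mtok: float,
    cached_read_multiplier: float,
) -> float:
    """Input cost when a fraction of tokens is served from the prompt cache.

    cached_read_multiplier is the assumed price of a cached token relative
    to an uncached one (for example 0.1 means cached reads cost 10%).
    """
    cached = total_tokens * hit_rate
    uncached = total_tokens - cached
    cost = uncached * price_per_mtok + cached * price_per_mtok * cached_read_multiplier
    return cost / 1_000_000


# Hypothetical inputs: 1M tokens, 1.0 currency unit per million tokens,
# cached reads assumed to cost 10% of the full input price.
TOTAL_TOKENS = 1_000_000
full_price = blended_input_cost(TOTAL_TOKENS, 0.0, 1.0, 0.1)
with_cache = blended_input_cost(TOTAL_TOKENS, 0.857, 1.0, 0.1)
saving_fraction = 1 - with_cache / full_price
```

Under these assumptions the saving grows linearly with the hit rate, which is why keeping prompts identical across sub-agents matters as much as the cache itself.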
Why the file-on-disk approach beats alternatives¶
| Alternative | Drawback |
|---|---|
| Embed full context in every sub-prompt | 7× token cost on the context portion |
| Pass context via stdin to each sub-agent | Loses prompt-caching; every sub-agent's prompt varies |
| Pin context in a KV / vector store | Adds a network round-trip + cache-miss class; more infra |
| Include context as a system message | Same token multiplication as embedding |
File-on-disk is: cheap (one write), provider-cacheable (same path → same content → cache hit), tool-uniform (sub-agents already have file-read), and shared-memory-like (no duplication across processes).
Related optimisations in the same system¶
- Per-file patches in `diff_directory`: each sub-reviewer reads only relevant files, not the whole diff. Fan-out by specialism at the file level.
- Same base prompts across all runs: sub-agent system prompts are stable; combined with the shared context file, they maximise what the provider's cache can hit.
- Session affinity: routes repeated requests to the same backend so the provider's local cache is reused (sibling optimisation; additive with shared-context fan-out).
Generalisation¶
Any orchestrator that fans out to N sub-agents over a shared input benefits:
- Multi-agent research pipelines that share a corpus.
- RAG systems where N downstream specialists use the same retrieved chunks.
- Judge / verifier pipelines that score the same trajectory from multiple angles.
- Fan-out content-moderation that runs N classifiers on the same post.
Rule: if multiple agents read the same context, put it on disk once and point everyone at the path.
Seen in¶
- sources/2026-04-20-cloudflare-orchestrating-ai-code-review-at-scale — canonical named instance; 7-sub-reviewer fan-out; 85.7% prompt-cache hit rate attributed partly to this pattern.
Related¶
- systems/cloudflare-ai-code-review — the production consumer.
- concepts/session-affinity-prompt-caching — the sibling optimisation that ensures the cache is reused across requests.
- patterns/coordinator-sub-reviewer-orchestration — the orchestration shape the fan-out sits inside.
- patterns/specialized-reviewer-agents — the sub-agent structure shared context supports.