CONCEPT

Shared-context fan-out

Shared-context fan-out is the pattern of writing large common context (merge-request metadata, previous review findings, diff summaries) to a shared file on disk and pointing multiple concurrent sub-agents at it via their file-read tool, rather than embedding the context in every sub-agent's prompt. The shared file is read once per agent via a tool call; the prompt stays small.

The token-cost problem it solves

A naïve N-sub-agent fan-out embeds the full MR context in each sub-agent's prompt:

agent 1 prompt: [base] + [agent 1 specialism] + [full MR context]
agent 2 prompt: [base] + [agent 2 specialism] + [full MR context]
...
agent N prompt: [base] + [agent N specialism] + [full MR context]

If each sub-agent runs independently, the same MR context is billed N times. Cloudflare's framing: "Duplicating even a moderately-sized MR context across seven concurrent reviewers would multiply our token costs by 7x."
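The arithmetic can be sketched with a toy cost model. All token counts and the cache-read discount below are illustrative assumptions, not Cloudflare's numbers:

```python
# Hypothetical cost model for an N-agent fan-out. Every constant here is an
# assumption for illustration only.
N_AGENTS = 7
BASE = 1_000          # shared base prompt tokens
SPECIALISM = 200      # per-agent instruction tokens
MR_CONTEXT = 30_000   # full MR context tokens
PATH_REF = 20         # tokens to name the shared file path in the prompt
CACHE_READ = 0.1      # assumed cached-input price relative to full price

# Naive: every agent's prompt embeds the full context, billed N times over.
embedded = N_AGENTS * (BASE + SPECIALISM + MR_CONTEXT)

# Fan-out: each agent reads the shared file via a tool call; the first read
# pays full price (cache write), later reads hit the provider cache.
fanned_out = (
    N_AGENTS * (BASE + SPECIALISM + PATH_REF)
    + MR_CONTEXT                                 # first read: cache write
    + (N_AGENTS - 1) * MR_CONTEXT * CACHE_READ   # later reads: cache hits
)

print(f"embedded: {embedded:,} vs fan-out: {fanned_out:,.0f} tokens")
```

Under these assumed numbers the embedded variant bills roughly 218k tokens against roughly 57k for the fan-out; the exact ratio depends entirely on the provider's cache pricing.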

The fan-out shape

  1. Orchestrator assembles the shared context once — in Cloudflare's case, a file literally named shared-mr-context.txt.
  2. Orchestrator also writes per-file patch files to a diff_directory so each sub-agent can read only the patches relevant to its specialism.
  3. Each sub-agent prompt references the path ("read shared-mr-context.txt and the patches for files X, Y, Z from diff_directory/") rather than embedding content.
  4. Sub-agents use their built-in file-read tool on the shared paths.
  5. Provider-side prompt caching then kicks in: the base agent prompts are identical across all runs, and the content read from shared-mr-context.txt is identical too, so cross-request cache-hit rates are high.
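The orchestrator side of steps 1 to 3 can be sketched as follows. Function and argument names are hypothetical; the sketch assumes sub-agents have a built-in file-read tool and only shows how the files and prompts are laid out:

```python
# Minimal orchestrator sketch (hypothetical names). Writes the shared context
# once, one patch file per changed file, and builds small per-agent prompts
# that reference paths instead of embedding content.
from pathlib import Path

def prepare_fanout(workdir: str, mr_context: str, patches: dict[str, str],
                   specialisms: dict[str, list[str]]) -> dict[str, str]:
    root = Path(workdir)
    diff_dir = root / "diff_directory"
    diff_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: shared context written to disk exactly once.
    (root / "shared-mr-context.txt").write_text(mr_context)

    # Step 2: one patch file per changed file in diff_directory.
    for filename, patch in patches.items():
        (diff_dir / f"{filename.replace('/', '_')}.patch").write_text(patch)

    # Step 3: small per-agent prompts that point at paths, not content.
    prompts = {}
    for agent, files in specialisms.items():
        wanted = ", ".join(
            f"diff_directory/{f.replace('/', '_')}.patch" for f in files
        )
        prompts[agent] = (
            "Read shared-mr-context.txt for MR metadata and prior findings. "
            f"Then read the patches relevant to your specialism: {wanted}."
        )
    return prompts
```

Each prompt stays a few dozen tokens regardless of how large the MR context grows; the heavy content only enters a sub-agent's context through its file-read tool, where caching applies.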

Observed effect

Cloudflare's production numbers validate the design: 85.7% prompt-cache hit rate across ~120 B tokens processed in the first 30 days. Cache reads dominate cache writes by roughly 10:1. The savings are quantified: "this saves us an estimated five figures compared to what we would pay at full input token pricing."

Why the file-on-disk approach beats alternatives

  Alternative                                 Drawback
  Embed full context in every sub-prompt      7× token cost on the context portion
  Pass context via stdin to each sub-agent    Loses prompt caching; every sub-agent's prompt varies
  Pin context in a KV / vector store          Adds a network round trip and a cache-miss class; more infra
  Include context as a system message         Same token multiplication as embedding

File-on-disk is: cheap (one write), provider-cacheable (same path → same content → cache hit), tool-uniform (sub-agents already have file-read), and shared-memory-like (no duplication across processes).

Three supporting choices reinforce the pattern:

  • Per-file patches in diff_directory — each sub-reviewer reads only relevant files, not the whole diff. Fan-out by specialism at the file level.
  • Same base prompts across all runs — sub-agent system prompts are stable; combined with the shared context file, they maximise what the provider's cache can hit.
  • Session affinity — routes repeated requests to the same backend so the provider's local cache is reused (sibling optimisation; additive with shared-context fan-out).

Generalisation

Any orchestrator that fans out to N sub-agents over a shared input benefits:

  • Multi-agent research pipelines that share a corpus.
  • RAG systems where N downstream specialists use the same retrieved chunks.
  • Judge / verifier pipelines that score the same trajectory from multiple angles.
  • Fan-out content-moderation that runs N classifiers on the same post.

Rule: if multiple agents read the same context, put it on disk once and point everyone at the path.
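A generic form of that rule can be sketched as a small helper. The function and directory names are hypothetical; content-addressing the filename is one way to keep identical context mapping to an identical path, which keeps every agent's prompt byte-stable and cache-friendly:

```python
# Hypothetical helper: persist shared context once and return the path that
# every agent's prompt should reference.
import hashlib
from pathlib import Path

def publish_shared_context(content: str,
                           cache_dir: str = "/tmp/shared-context") -> Path:
    Path(cache_dir).mkdir(parents=True, exist_ok=True)
    # Content-addressed filename: the same context always yields the same
    # path, so repeated fan-outs over unchanged context reuse one file.
    digest = hashlib.sha256(content.encode()).hexdigest()[:16]
    path = Path(cache_dir) / f"ctx-{digest}.txt"
    if not path.exists():
        path.write_text(content)  # one write, N readers
    return path
```

An orchestrator would call this once per fan-out and interpolate the returned path into each sub-agent prompt.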
