
Context-segregated sub-agents

Pattern

When an agent needs to do work that (a) would consume large amounts of the main context window, (b) needs a different tool surface than the parent, or (c) is exposed to untrusted input that should not enter the planner's context, spawn a sub-agent with its own fresh context array and its own narrow tool allowlist. The parent invokes the sub-agent as a single call, receives a summary (or structured result), and appends only that summary to the parent's context.

Fly.io's canonical formulation — also the simplest possible demystification of "Claude Code sub-agents":

"Take 'sub-agents'. People make a huge deal out of Claude Code's sub-agents, but you can see now how trivial they are to implement: just a new context array, another call to the model. Give each call different tools. Make sub-agents talk to each other, summarize each other, collate and aggregate." (Source: sources/2025-11-06-flyio-you-should-write-an-agent.)

The pattern rests on the fact that a "conversation" is just a Python list of messages (concepts/agent-loop-stateless-llm) — a sub-agent is simply a second list with a second LLM call run against it.
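A minimal sketch of that fact. Here call_model is a hypothetical stub standing in for any chat-completion API call; in real code it would send the context to an LLM endpoint:

```python
# call_model is a hypothetical stub for a chat-completion API call.
def call_model(context: list[dict]) -> str:
    # A real version sends `context` to an LLM; we echo for illustration.
    return f"summary of: {context[-1]['content']}"

parent_ctx = [{"role": "system", "content": "You are the planner."}]

# Spawning a sub-agent is just a fresh list and a second model call.
sub_ctx = [
    {"role": "system", "content": "You summarise documents."},
    {"role": "user", "content": "big untrusted document ..."},
]
summary = call_model(sub_ctx)

# Only the summary enters the parent's context; sub_ctx is discarded.
parent_ctx.append({"role": "user", "content": summary})
```

The sub-agent's exploratory turns, tool results, and the raw document never appear in parent_ctx — only the one appended message does.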

What segregation buys

  • Token-budget isolation. Exploratory tool calls, dead-ends, and large intermediate outputs stay inside the sub-agent's context. The parent's token budget only spends on the sub-agent's returned summary.
  • Tool-surface narrowing. Each sub-agent sees only the tools relevant to its concern — a security-review sub-agent gets read-only code-analysis tools; a deployment sub-agent gets write-capable deploy tools. Per-turn schema overhead shrinks and tool-selection accuracy improves.
  • Security segregation. Fly.io: "You can trivially build an agent with segregated contexts, each with specific tools. That makes LLM security interesting." Hostile or untrusted input (web-search results, user-uploaded files, email bodies, third-party API output) goes into the sub-agent's context where a successful prompt-injection cannot rewrite the parent's plan. Complementary to patterns/untrusted-input-via-file-not-prompt.
  • Parallelism. Multiple sub-agents can run concurrently over disjoint inputs; results feed a collator sub-agent or the parent.
  • Specialisation. Each sub-agent gets its own system prompt, its own model choice (cheap model for classification, expensive model for synthesis), its own temperature.
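Tool-surface narrowing and specialisation together amount to a small per-agent registry. A sketch, with every name (agent keys, model names, tool names) purely illustrative rather than a real API:

```python
# Hypothetical registry: each sub-agent gets its own prompt, model,
# and narrow tool allowlist. All names here are illustrative.
SUB_AGENTS = {
    "security_review": {
        "system_prompt": "Review code for vulnerabilities. Read only.",
        "model": "big-expensive-model",              # synthesis-grade
        "tools": ["read_file", "grep", "list_dir"],  # read-only surface
    },
    "deploy": {
        "system_prompt": "Deploy the built artifact.",
        "model": "small-cheap-model",
        "tools": ["run_deploy", "read_logs"],        # write-capable
    },
}

def tools_for(agent_name: str) -> list[str]:
    # The parent never hands its full tool surface to a sub-agent.
    return SUB_AGENTS[agent_name]["tools"]
```

The point of the lookup is the negative space: the security reviewer cannot call run_deploy because the schema is never offered to it.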

Concrete shapes

  • Fan-out / collate. Parent spawns N sub-agents over partitioned input, collects their summaries, passes them to a collator sub-agent that aggregates.
  • Planner + executor. Parent plans and dispatches; one or more executor sub-agents do the narrow task with write-capable tools. See patterns/specialized-agent-decomposition.
  • Coordinator + specialists. Cloudflare AI Code Review canonical instance (patterns/coordinator-sub-reviewer-orchestration): coordinator sub-agent spawns one specialised-reviewer sub-agent per review category, each with its own prompt + tool allowlist, coordinator aggregates findings.
  • Summarise-then-forget. Sub-agent reads a large corpus (long doc, log file, big search result), returns a summary; the raw corpus never enters any context.
  • Dual personalities. Two sub-agents with orthogonal prompts on the same input, compare outputs (Fly.io's Alph truth-teller / Ralph liar demo).
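The fan-out / collate shape can be sketched in a few lines, assuming summarize stands in for one full sub-agent run (a hypothetical stub here) and the collator is reduced to a join:

```python
from concurrent.futures import ThreadPoolExecutor

def summarize(chunk: str) -> str:
    # Stub for one sub-agent run: fresh context, own tool loop.
    return f"summary({chunk})"

def collate(summaries: list[str]) -> str:
    # A collator sub-agent would aggregate; joining keeps the sketch small.
    return " | ".join(summaries)

def fan_out(chunks: list[str]) -> str:
    # Sub-agents have disjoint contexts, so they can run concurrently.
    with ThreadPoolExecutor() as pool:
        summaries = list(pool.map(summarize, chunks))
    return collate(summaries)
```

pool.map preserves input order, so the collator sees summaries in the same order as the partitioned input.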

Implementation reminder (cheap)

From the minimal agent loop:

def sub_agent(task: str, tools_subset: list) -> str:
    # Fresh context: the sub-agent never sees the parent's history.
    ctx = [
        {"role": "system", "content": SUB_AGENT_PROMPT},
        {"role": "user", "content": task},
    ]
    # Run the same tool-call loop against ctx, offering only
    # tools_subset; the loop's last assistant message is the result.
    return final_assistant_text

Fly.io's encouragement applies here: "your wackiest idea will probably (1) work and (2) take 30 minutes to code." The cost of experimentation is low; the correct move is to try the structure, not to architect it up front.

Caveats

  • The parent must still not trust sub-agent output blindly. A prompt-injected sub-agent can return a hostile summary. Treat sub-agent output as untrusted input (patterns/llm-output-as-untrusted-input).
  • Budget can be spent N× across sub-agents. Each sub-agent has its own context and each pays tokens for its system prompt and tool schemas. The segregation buys the parent budget discipline; fleet-level token cost is not automatically lower.
  • Choose the interchange format deliberately. Fly.io lists it as an open design problem: "what the most reliable intermediate forms are (JSON blobs? SQL databases? Markdown summaries) for interchange between them." Over-structured forms (strict JSON schemas) cost both ends tokens; under-structured forms (prose summaries) lose fidelity.
  • Early-exit-by-lying-to-self. A sub-agent can report "done" when it isn't. Ground-truth verifiers (tests green, compiler clean, independent judge agent) reduce this (concepts/agentic-development-loop).
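The first caveat has a cheap partial mitigation worth sketching: append sub-agent output only as plain user data inside explicit delimiters, never as a system message. The delimiter tag below is an illustrative convention, not a standard:

```python
def wrap_untrusted(summary: str) -> dict:
    # Sub-agent output enters the parent as delimited user data,
    # never as a system message that could redefine the parent's role.
    return {
        "role": "user",
        "content": f"<sub_agent_report>\n{summary}\n</sub_agent_report>",
    }
```

This does not make injected text safe — the parent model can still be persuaded by it — but it removes the cheapest escalation path of smuggling instructions into a privileged role.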

Seen in
