PATTERN Cited by 6 sources
Specialized agent decomposition¶
Build per-domain agents (storage, databases, client-side traffic, network, …) that each carry a small, well-scoped toolset, and let them collaborate on an end-to-end analysis — rather than building one mega-agent that carries every tool and context for every domain.
Intent¶
A single general-purpose agent suffers from two failure modes as it grows:
- Tool-selection noise. Large tool inventories make the LLM more likely to pick the wrong, or a less optimal, tool.
- Context crowding. Packing every domain's system prompts into one context dilutes each domain's instructions and runs into context-window limits.
Decomposition puts each domain's tools and prompts in a dedicated agent whose reasoning space is small, then composes their outputs for cross-domain investigations.
When to reach for it¶
- You already have a patterns/tool-decoupled-agent-framework: adding an agent ≈ adding a configuration, not a new codebase.
- Debugging / investigation spans multiple subsystems (e.g., DB + client traffic + storage).
- You observe tool-selection errors correlated with tool-inventory growth.
Mechanism¶
- Carve along coherent domains. Each agent owns a specific scope: one system-and-database agent, one client-traffic agent, etc. Tools within an agent are cohesive.
- Shared infrastructure. Framework (LLM client, conversation state, tool-call parser, snapshot/replay harness) lives once; each agent instantiates it.
- Collaboration protocol. Either an orchestrator agent routes questions to specialists and merges outputs, or specialists hand off to each other via well-defined events. Databricks' post describes collaboration but doesn't spec the protocol.
- Per-agent evaluation. Each agent has its own snapshot-replay corpus (see patterns/snapshot-replay-agent-evaluation); specialization makes eval more tractable, not less.
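The mechanism above can be sketched as a minimal orchestrator over two specialists. This is an illustrative shape, not Databricks' implementation: the class names, keyword router, and lambda "tools" are all hypothetical stand-ins for LLM-backed components.

```python
from dataclasses import dataclass

@dataclass
class SpecialistAgent:
    domain: str
    system_prompt: str        # focused, domain-only instructions
    tools: dict               # small, cohesive tool inventory

    def investigate(self, question: str) -> str:
        # Stand-in for the real LLM loop; here we just run the one tool.
        # A production agent would let the model choose among self.tools.
        name, tool = next(iter(self.tools.items()))
        return f"[{self.domain}] {name}: {tool(question)}"

class Orchestrator:
    """Routes questions to specialists and merges their findings."""

    def __init__(self, specialists: list):
        self.specialists = specialists

    def route(self, question: str) -> list:
        # Naive keyword routing; real systems use an LLM or classifier.
        hits = [a for a in self.specialists if a.domain in question.lower()]
        return hits or self.specialists

    def investigate(self, question: str) -> str:
        # Merge step; a production system might run another LLM pass here.
        return "\n".join(a.investigate(question) for a in self.route(question))

db_agent = SpecialistAgent(
    domain="database",
    system_prompt="You diagnose database issues only.",
    tools={"slow_query_log": lambda q: "3 queries above p99 latency"},
)
traffic_agent = SpecialistAgent(
    domain="traffic",
    system_prompt="You diagnose client-traffic patterns only.",
    tools={"qps_history": lambda q: "2.4x QPS surge at 09:00"},
)

orch = Orchestrator([db_agent, traffic_agent])
print(orch.investigate("why is the database slow during the traffic spike?"))
```

A cross-domain question routes to both specialists and their findings are concatenated; a single-domain question reaches only one agent, so only that agent's tool inventory and prompt are ever in play.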
Why it helps¶
- Deep expertise per agent. Smaller tool inventory + focused prompt + focused eval corpus = better domain accuracy.
- Parallel team development. Different teams can own different specialist agents.
- Incremental rollout. New domains get their own agent without destabilizing existing ones.
- Extensibility beyond original scope. Once a few agents exist, adding one for a new system (say, caching, or Kubernetes) is a well-defined template.
Tradeoffs¶
- Orchestration overhead. Cross-domain questions now require coordination — "this is a DB issue triggered by a client-side surge" requires both agents. A poorly designed coordination layer degrades both latency and UX.
- Consistency. Multiple specialists can return overlapping or contradictory diagnoses. Need a reconciliation step or primary-agent mechanic.
- Boundary drift. A signal that looks like a DB issue may actually live in client traffic; agents must know when to hand off.
Seen in¶
- sources/2025-12-03-databricks-ai-agent-debug-databases — Databricks' systems/storex enables "specialized agents for different domains: one focused on system and database issues, another on client-side traffic patterns, and so on. This decomposition enables each agent to build deep expertise in its area while collaborating with others to deliver a more complete root cause analysis. It also paves the way for integrating AI agents into other parts of our infrastructure, extending beyond databases."
- sources/2026-04-20-cloudflare-orchestrating-ai-code-review-at-scale — Cloudflare's AI Code Review system is the canonical wiki instance of the pattern applied to code review. Seven specialised sub-reviewers (security, performance, code quality, documentation, release, AGENTS.md, engineering-codex) run in parallel, each with a tightly scoped prompt and an explicit "What NOT to flag" section, coordinated by a judge-pass coordinator on the top model tier. See the dedicated pattern patterns/specialized-reviewer-agents and the orchestration shape patterns/coordinator-sub-reviewer-orchestration. Production scale (first 30 days): 131,246 runs across 5,169 repos; 85.7% prompt-cache hit rate; ~1.2 findings per review.
- sources/2025-11-17-dropbox-how-dash-uses-context-engineering-for-smarter-ai — Dropbox Dash extracts query construction for its universal search tool into a dedicated search sub-agent. The main planning agent decides when to search; the sub-agent owns the how (user-intent → index-field mapping, query rewriting for semantic matching, typos / synonyms / implicit context). Named rationale: "When a tool demands too much explanation or context to be used effectively, it's often better to turn it into a dedicated agent with a focused prompt." This is the pattern applied to sub-tool complexity, not just domain separation — same shape, different motivation from Storex.
- sources/2026-01-28-dropbox-knowledge-graphs-mcp-dspy-dash — Josh Clemm's companion talk extends the Dash decomposition with an additional mechanism: a classifier picks the sub-agent for complex agentic queries, each sub-agent having a much narrower tool set. "We use a lot of sub-agents for very complex agentic queries, and have a classifier effectively pick the sub-agent with a much more narrow set of tools." This adds a named routing mechanism to the pattern (the classifier) that the 2025-11-17 post didn't explicitly describe; it also positions specialized-agent-decomposition as one of four named fixes Dash applied to make MCP work at scale (alongside patterns/unified-retrieval-tool, knowledge-graph-bundle token compression, and tool-result-local-storage).
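The classifier-routing mechanism Clemm describes can be sketched as follows. The keyword classifier and the sub-agent names are hypothetical stand-ins for Dash's LLM classifier; only the shape — one classifier pass, then a single sub-agent with a narrow tool set — comes from the source.

```python
# Each sub-agent carries a much narrower tool set than one monolithic
# agent would. Tool names here are illustrative, not Dash's.
SUB_AGENTS = {
    "search":    ["rewrite_query", "map_intent_to_fields", "run_search"],
    "calendar":  ["list_events", "find_free_slot"],
    "summarize": ["fetch_doc", "summarize_doc"],
}

def classify(query: str) -> str:
    """Stand-in for the LLM classifier that picks one sub-agent per query."""
    q = query.lower()
    if "meeting" in q or "schedule" in q:
        return "calendar"
    if "summar" in q:
        return "summarize"
    return "search"          # default sub-agent

def handle(query: str):
    """Route the query; only the chosen sub-agent's tools enter the context."""
    name = classify(query)
    return name, SUB_AGENTS[name]

# Only 2 of the 7 tools defined above reach the model for this query.
print(handle("schedule a meeting with the infra team"))
```

The point of the classifier is context economy: the model never sees the union of all tool definitions, only the chosen sub-agent's narrow slice.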
Two framings of the same pattern¶
- Domain-based decomposition (Storex). One agent per domain (DB, client traffic, storage, network); composition layer routes cross-domain questions. Intent: scale tool inventory + prompt specialization across many areas of expertise.
- Sub-tool decomposition (Dash). Extract one specific tool's own internal complexity into a sub-agent, because the tool's explanation otherwise starves the parent's context budget. Intent: protect context budget when a single tool's instruction weight grows.
Both converge on the same mechanism (dedicated prompt + dedicated tool surface + orchestration hand-off) for different reasons. A mature production system often does both — per-domain agents plus, within each, sub-agents for the most complex sub-tasks.
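The Dash-style sub-tool framing can be sketched like this. The query-construction logic is a toy stand-in for the sub-agent's LLM call, and every name here is illustrative; only the when/how split between parent and sub-agent comes from the source.

```python
SEARCH_SUBAGENT_PROMPT = """You construct search queries.
Map user intent to index fields, rewrite for semantic matching,
and handle typos, synonyms, and implicit context."""

def search_subagent(user_request: str) -> dict:
    """Owns the *how*: stand-in for an LLM call carrying the prompt above."""
    filler = {"find", "the", "my", "a", "please"}
    terms = [w for w in user_request.lower().split() if w not in filler]
    field = "title" if "doc" in terms else "body"   # intent -> index field
    return {"field": field, "query": " ".join(terms)}

def parent_agent(user_request: str) -> dict:
    # The parent decides *when* to search; its context only needs to know
    # search(request) -> results, not the sub-agent's instruction weight.
    if "find" in user_request.lower():
        return search_subagent(user_request)
    return {"answer": "no search needed"}

print(parent_agent("Find my Q3 planning doc"))
```

The parent's context budget stays flat no matter how elaborate the sub-agent's prompt grows, which is exactly the motivation Dash names.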
AWS reference-architecture shape (2025-12-11)¶
AWS's conversational-observability Strands deployment adopts this pattern with a three-agent split over Kubernetes troubleshooting:
- Agent Orchestrator — coordinates the troubleshooting workflow across the other two agents ("Coordinates troubleshooting workflows").
- Memory Agent — owns conversation context and historical insights across turns and sessions ("Manages conversation context and historical insights").
- K8s Specialist — narrow-surface diagnostic agent calling EKS MCP Server tools ("Handles Kubernetes diagnostics").
The decomposition mirrors Storex (per-storage-layer specialists), Dash (classifier-routed sub-agents), and Cloudflare Agent Lee (domain-per-team agents): same pattern, same rationale — keeping each agent's tool inventory small enough for reliable selection and small enough to fit in context. It is an operations-side instance of the pattern, the same shape as Storex's storage-incident instance. (Source: sources/2025-12-11-aws-architecting-conversational-observability-for-cloud-applications)
Verification-gated inner-loop variant (DS-STAR, 2025-11-06)¶
Google Research's DS-STAR data-science agent is the canonical wiki instance of specialised-agent decomposition organised by role in a refinement loop, not by subject-matter domain:
- Data File Analyzer — writes + runs a file-summarisation script.
- Planner — emits the high-level plan.
- Coder — turns the plan into executable code, runs it.
- Verifier — LLM judge scoring plan sufficiency against intermediate results.
- Router — on reject, decides add-step vs fix-step.
The agents specialise not because they own different domains but because they own different roles in the plan → implement → verify → refine loop; the Verifier gates each cycle, and the Router's add-or-fix decision is the refinement primitive. Full pattern is patterns/planner-coder-verifier-router-loop; loop-level concept is concepts/iterative-plan-refinement.
Ablations quantify the decomposition's value: removing the Data File Analyzer collapses DABStep hard-task accuracy from 45.2% to 26.98%; removing the Router (forcing extend-only) degrades both easy and hard tasks — "it is more effective to correct mistakes in a plan than to keep adding potentially flawed steps." (Source: sources/2025-11-06-google-ds-star-versatile-data-science-agent)
Adds a third framing to this pattern's taxonomy alongside the Storex (domain-based) and Dash (sub-tool) framings:
- Role-in-the-refinement-loop decomposition (DS-STAR). One agent per loop role (context, plan, implement, judge, route); coordination is the loop itself. Intent: make verification and revision first-class, isolate ablation-testable primitives.
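The role-per-loop decomposition can be sketched with plain functions standing in for DS-STAR's LLM-backed roles. The toy plans, the verifier's acceptance rule, and the router's policy are illustrative assumptions, not Google's implementation; only the loop shape (analyze → plan → implement → verify → route) comes from the source.

```python
def analyzer(data_file: str) -> str:
    """Data File Analyzer: stand-in for a generated file-summarisation script."""
    return f"summary of {data_file}: 2 columns, 100 rows"

def planner(context: str) -> list:
    """Planner: emits the initial high-level plan from the file summaries."""
    return ["load data", "compute stats"]

def coder(plan: list) -> str:
    """Coder: turns the plan into executable code and runs it (joined here)."""
    return "; ".join(plan)

def verifier(plan: list, result: str) -> bool:
    """Verifier: LLM-judge stand-in; demands the plan include a reporting step."""
    return any("report" in step for step in plan)

def router(plan: list) -> list:
    """Router: on reject, decide add-step vs fix-step (toy policy)."""
    if plan and "stats" in plan[-1]:
        return plan + ["report findings"]       # add-step: extend the plan
    return plan[:-1] + ["report findings"]      # fix-step: replace the last step

def ds_star_loop(data_file: str, max_iters: int = 5) -> list:
    plan = planner(analyzer(data_file))
    for _ in range(max_iters):
        result = coder(plan)
        if verifier(plan, result):              # verifier gates each cycle
            return plan
        plan = router(plan)                     # the refinement primitive
    return plan

print(ds_star_loop("sales.csv"))
```

Each role is a separately swappable (and, as the ablations show, separately measurable) unit; coordination is the loop itself rather than an orchestrator agent.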
Related¶
- patterns/tool-decoupled-agent-framework — the enabling framework.
- patterns/snapshot-replay-agent-evaluation — per-agent eval.
- systems/storex
- systems/strands-agents-sdk
- concepts/agentic-troubleshooting-loop
- patterns/planner-coder-verifier-router-loop — role-in-the-refinement-loop variant of this pattern.
- systems/ds-star — canonical instance of role-based decomposition with inner-loop verification.
- concepts/iterative-plan-refinement — the loop-level discipline.