PATTERN Cited by 10 sources

Specialized agent decomposition

Build per-domain agents (storage, databases, client-side traffic, network, …) that each carry a small, well-scoped toolset, and let them collaborate on an end-to-end analysis — rather than building one mega-agent that carries every tool and context for every domain.

Intent

A single general-purpose agent suffers two failure modes as it grows:

  • Tool-selection noise. Large tool inventories make the LLM more likely to pick the wrong tool or a suboptimal one.
  • Context crowding. Packing domain-specific system prompts into one context dilutes each domain's instructions and hits context-window limits.

Decomposition puts each domain's tools and prompts in a dedicated agent whose reasoning space is small, then composes their outputs for cross-domain investigations.

When to reach for it

  • You already have a patterns/tool-decoupled-agent-framework: adding an agent ≈ adding a configuration, not a new codebase.
  • Debugging / investigation spans multiple subsystems (e.g., DB + client traffic + storage).
  • You observe tool-selection errors correlated with tool-inventory growth.

Mechanism

  1. Carve along coherent domains. Each agent owns a specific scope: one system-and-database agent, one client-traffic agent, etc. Tools within an agent are cohesive.
  2. Shared infrastructure. Framework (LLM client, conversation state, tool-call parser, snapshot/replay harness) lives once; each agent instantiates it.
  3. Collaboration protocol. Either an orchestrator agent routes questions to specialists and merges outputs, or specialists hand off to each other via well-defined events. Databricks' post describes collaboration but doesn't specify the protocol.
  4. Per-agent evaluation. Each agent has its own snapshot-replay corpus (see patterns/snapshot-replay-agent-evaluation); specialization makes eval more tractable, not less.
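
The steps above can be sketched as a tiny orchestrator over two specialists. This is a minimal illustration under stated assumptions, not the protocol of any system cited here: the class names, the keyword-based routing, and the canned tool outputs are all hypothetical, and a real agent would let an LLM choose among its tools rather than run all of them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A tool is any callable that takes a question and returns a text finding.
Tool = Callable[[str], str]


@dataclass
class SpecialistAgent:
    """One domain, one small toolset (hypothetical structure)."""
    name: str
    keywords: List[str]  # hypothetical routing hints
    tools: Dict[str, Tool] = field(default_factory=dict)

    def investigate(self, question: str) -> List[str]:
        # Stand-in for LLM reasoning: run every tool in the small inventory.
        return [f"[{self.name}/{tool_name}] {tool(question)}"
                for tool_name, tool in self.tools.items()]


class Orchestrator:
    """Routes a question to matching specialists and merges their findings."""

    def __init__(self, agents: List[SpecialistAgent]):
        self.agents = agents

    def route(self, question: str) -> List[SpecialistAgent]:
        q = question.lower()
        matched = [a for a in self.agents if any(k in q for k in a.keywords)]
        return matched or self.agents  # no match: fall back to asking everyone

    def analyze(self, question: str) -> List[str]:
        findings: List[str] = []
        for agent in self.route(question):
            findings.extend(agent.investigate(question))
        return findings


# Hypothetical tools that return canned findings.
db_agent = SpecialistAgent(
    name="database",
    keywords=["query", "database", "db"],
    tools={"slow_query_log": lambda q: "3 queries over 5s in the last hour"},
)
traffic_agent = SpecialistAgent(
    name="client-traffic",
    keywords=["traffic", "surge", "client"],
    tools={"request_rate": lambda q: "request rate 4x baseline"},
)

orchestrator = Orchestrator([db_agent, traffic_agent])
report = orchestrator.analyze("Why is the DB slow during the client surge?")
```

Adding a domain in this sketch means constructing one more `SpecialistAgent`; the orchestrator and the shared plumbing stay untouched, which is the property steps 1 and 2 aim for.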

Why it helps

  • Deep expertise per agent. Smaller tool inventory + focused prompt + focused eval corpus = better domain accuracy.
  • Parallel team development. Different teams can own different specialist agents.
  • Incremental rollout. New domains get their own agent without destabilizing existing ones.
  • Extensibility beyond original scope. Once a few agents exist, adding one for a new system (say, caching, or Kubernetes) is a well-defined template.

Tradeoffs

  • Orchestration overhead. Cross-domain questions now require coordination — "this is a DB issue triggered by a client-side surge" requires both agents. Poorly designed coordination layers regress latency and UX.
  • Consistency. Multiple specialists can return overlapping or contradictory diagnoses, so the system needs a reconciliation step or a primary-agent mechanism.
  • Boundary drift. A signal that looks like a DB issue may actually live in client traffic; agents must know when to hand off.
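
One possible shape for the reconciliation step mentioned above, as a hedged sketch: each specialist returns a diagnosis with a self-reported confidence, and a reconciler either defers to a designated primary agent or takes the most confident answer, while recording who disagreed. The `Diagnosis` and `Verdict` types, the confidence field, and the tie-breaking rule are assumptions made for illustration; none of the cited sources specifies this logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Diagnosis:
    agent: str
    root_cause: str
    confidence: float  # hypothetical self-reported score in [0, 1]


@dataclass(frozen=True)
class Verdict:
    root_cause: str
    contested: bool        # True when specialists named different causes
    dissenting: List[str]  # agents whose diagnosis was not chosen


def reconcile(diagnoses: List[Diagnosis], primary: Optional[str] = None) -> Verdict:
    """Pick one root cause; prefer the primary agent if it answered."""
    if not diagnoses:
        raise ValueError("no diagnoses to reconcile")
    chosen = None
    if primary is not None:
        chosen = next((d for d in diagnoses if d.agent == primary), None)
    if chosen is None:
        # No primary configured (or it did not answer): highest confidence wins.
        chosen = max(diagnoses, key=lambda d: d.confidence)
    distinct_causes = {d.root_cause for d in diagnoses}
    dissenting = [d.agent for d in diagnoses if d.root_cause != chosen.root_cause]
    return Verdict(chosen.root_cause, len(distinct_causes) > 1, dissenting)


diagnoses = [
    Diagnosis("database", "lock contention", 0.6),
    Diagnosis("client-traffic", "request surge", 0.8),
]
by_confidence = reconcile(diagnoses)
by_primary = reconcile(diagnoses, primary="database")
```

Keeping the `dissenting` list in the verdict is a deliberate choice in this sketch: a contested result is surfaced to the reader instead of being silently resolved.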

Seen in

  • sources/2025-12-03-databricks-ai-agent-debug-databases — Databricks' systems/storex enables "specialized agents for different domains: one focused on system and database issues, another on client-side traffic patterns, and so on. This decomposition enables each agent to build deep expertise in its area while collaborating with others to deliver a more complete root cause analysis. It also paves the way for integrating AI agents into other parts of our infrastructure, extending beyond databases."

  • sources/2026-04-20-cloudflare-orchestrating-ai-code-review-at-scale — Cloudflare's AI Code Review system is the canonical wiki instance of the pattern applied to code review. Seven specialised sub-reviewers (security, performance, code quality, documentation, release, AGENTS.md, engineering-codex) run in parallel, each with a tightly scoped prompt and an explicit "What NOT to flag" section. Coordinated by a judge-pass coordinator on the top model tier. See the specialisation-dedicated pattern patterns/specialized-reviewer-agents and the orchestration shape patterns/coordinator-sub-reviewer-orchestration. Production scale (first 30 days): 131,246 runs across 5,169 repos; 85.7% prompt-cache hit rate; ~1.2 findings per review.

  • sources/2025-12-01-slack-streamlining-security-investigations-with-agents — canonical security-investigation instance. Slack's Spear applies the pattern at the Expert-agent layer: four specialised security-domain experts (Access: authentication/authorization/perimeter; Cloud: infrastructure/compute/orchestration/networking; Code: source-code + configuration-management analysis; Threat: threat analysis + intelligence). Each Expert owns a distinct toolset / data-source set tied to its domain; the Director broadcasts a question to all four in the discovery phase and picks one in the trace phase. Distinct architectural sibling structure: Slack composes the peer-Expert layer with a supra-agent (Director) + a meta-agent (Critic) rather than the Databricks peer-collaboration framing; see patterns/director-expert-critic-investigation-loop for the three-persona shape. Cost-tier strategy is explicit (knowledge pyramid): Experts on cheap models because leaf work is tool-call-heavy and cognitively shallow; Critic on mid-tier; Director on top-tier. Canonical emergent-behaviour payoff: the Critic caught a credential exposure the Expert missed, and the Director pivoted the investigation.

  • sources/2025-11-17-dropbox-how-dash-uses-context-engineering-for-smarter-ai — Dropbox Dash extracts query construction for its universal search tool into a dedicated search sub-agent. The main planning agent decides when to search; the sub-agent owns the how (user-intent → index-field mapping, query rewriting for semantic matching, typos / synonyms / implicit context). Named rationale: "When a tool demands too much explanation or context to be used effectively, it's often better to turn it into a dedicated agent with a focused prompt." This is the pattern applied to sub-tool complexity, not just domain separation — same shape, different motivation from Storex.

  • sources/2026-01-28-dropbox-knowledge-graphs-mcp-dspy-dash — Josh Clemm's companion talk extends the Dash decomposition with an additional mechanism: a classifier picks the sub-agent for complex agentic queries, each sub-agent having a much narrower tool set. "We use a lot of sub-agents for very complex agentic queries, and have a classifier effectively pick the sub-agent with a much more narrow set of tools." This adds a named routing mechanism to the pattern (the classifier) that the 2025-11-17 post didn't explicitly describe; it also positions specialized-agent-decomposition as one of four named fixes Dash applied to make MCP work at scale (alongside patterns/unified-retrieval-tool, knowledge-graph-bundle token compression, and tool-result-local-storage).
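
The classifier routing described in the last entry above can be illustrated with a short sketch. Everything here is an assumption for illustration: the sub-agent names, the tool names, and the keyword scoring that stands in for a trained classifier. The source does not describe how Dash's classifier works internally.

```python
from typing import Dict, List

# Hypothetical sub-agents, each exposing only a narrow set of tool names.
SUB_AGENT_TOOLS: Dict[str, List[str]] = {
    "search": ["unified_search"],
    "calendar": ["list_events", "create_event"],
    "docs": ["open_document", "summarize_document"],
}

# Hypothetical keyword lists standing in for a trained classifier.
ROUTING_KEYWORDS: Dict[str, List[str]] = {
    "calendar": ["meeting", "schedule", "calendar"],
    "docs": ["summarize", "document", "doc"],
}


def classify(query: str, default: str = "search") -> str:
    """Return the sub-agent whose keywords match the query most often."""
    q = query.lower()
    scores = {name: sum(kw in q for kw in kws)
              for name, kws in ROUTING_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


def tools_for(query: str) -> List[str]:
    """The only tools the chosen sub-agent (and its LLM) would ever see."""
    return SUB_AGENT_TOOLS[classify(query)]
```

For example, `tools_for("Schedule a meeting with Ana")` yields only the two calendar tools, so the model handling that query never has to choose among the full inventory.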

Two framings of the same pattern

  • Domain-based decomposition (Storex). One agent per domain (DB, client traffic, storage, network); composition layer routes cross-domain questions. Intent: scale tool inventory + prompt specialization across many areas of expertise.
  • Sub-tool decomposition (Dash). Extract one specific tool's own internal complexity into a sub-agent, because the tool's explanation otherwise starves the parent's context budget. Intent: protect context budget when a single tool's instruction weight grows.

Both converge on the same mechanism (dedicated prompt + dedicated tool surface + orchestration hand-off) for different reasons. A mature production system often does both — per-domain agents plus, within each, sub-agents for the most complex sub-tasks.
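
A minimal sketch of the sub-tool framing, assuming a planner that only decides when to search and a sub-agent that owns how the query is built. The class names, the synonym table, and the field hints are hypothetical; in a real system the sub-agent would be an LLM with a focused prompt, not a lookup table.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SearchQuery:
    terms: List[str]
    filters: Dict[str, str]


class SearchSubAgent:
    """Owns the 'how' of search: intent-to-field mapping and rewriting."""
    SYNONYMS = {"doc": "document", "docs": "document", "pics": "image"}  # hypothetical
    FIELD_HINTS = {"pdf": ("file_type", "pdf"),
                   "image": ("file_type", "image")}  # hypothetical

    def build_query(self, user_intent: str) -> SearchQuery:
        tokens = re.findall(r"[a-z0-9]+", user_intent.lower())
        terms: List[str] = []
        filters: Dict[str, str] = {}
        for tok in tokens:
            tok = self.SYNONYMS.get(tok, tok)  # normalise synonyms first
            if tok in self.FIELD_HINTS:
                field_name, value = self.FIELD_HINTS[tok]
                filters[field_name] = value    # map intent onto an index field
            else:
                terms.append(tok)
        return SearchQuery(terms=terms, filters=filters)


class PlanningAgent:
    """Owns the 'when': it only needs to know that search exists."""

    def __init__(self, search: SearchSubAgent):
        self.search = search

    def handle(self, request: str) -> Optional[SearchQuery]:
        if request.lower().startswith("find "):
            return self.search.build_query(request[len("find "):])
        return None  # not a search request; nothing to delegate


planner = PlanningAgent(SearchSubAgent())
query = planner.handle("find budget pdf")
```

The planner's own instructions stay short because none of the query-construction detail lives there, which is the context-budget argument made for the Dash framing.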

AWS reference-architecture shape (2025-12-11)

AWS's conversational-observability Strands deployment adopts this pattern with a three-agent split over Kubernetes troubleshooting:

  • Agent Orchestrator — coordinates the troubleshooting workflow across the other two agents. "Coordinates troubleshooting workflows."
  • Memory Agent — owns conversation context and historical insights across turns / sessions. "Manages conversation context and historical insights."
  • K8s Specialist — narrow-surface diagnostic agent calling EKS MCP Server tools. "Handles Kubernetes diagnostics."

The decomposition mirrors Storex (per-storage-layer specialists), Dash (classifier-routed sub-agents), and Cloudflare Agent Lee (domain-per-team agents): the pattern and the rationale are the same, namely keeping each agent's tool inventory small enough for reliable selection and small enough to fit in context. This is an operations instance of the pattern, with the same shape as Storex's storage-incident instance. (Source: sources/2025-12-11-aws-architecting-conversational-observability-for-cloud-applications)

Verification-gated inner-loop variant (DS-STAR, 2025-11-06)

Google Research's DS-STAR data-science agent is the canonical wiki instance of specialised-agent decomposition organised by role in a refinement loop, not by subject-matter domain:

  • Data File Analyzer — writes + runs a file-summarisation script.
  • Planner — emits the high-level plan.
  • Coder — turns the plan into executable code, runs it.
  • Verifier — LLM judge scoring plan sufficiency against intermediate results.
  • Router — on reject, decides add-step vs fix-step.

The agents specialise not because they own different domains but because they own different roles in the plan → implement → verify → refine loop; the Verifier gates each cycle, and the Router's add-or-fix decision is the refinement primitive. Full pattern is patterns/planner-coder-verifier-router-loop; loop-level concept is concepts/iterative-plan-refinement.
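
The control flow of such a loop can be sketched with each role passed in as a plain callable. This is an illustration under assumptions, not DS-STAR's implementation: the function signatures are hypothetical, the roles would be LLM calls in practice, and "fix" is simplified here to rewriting only the most recent step.

```python
from typing import Callable, List


def refinement_loop(
    analyze_files: Callable[[], str],                   # Data File Analyzer
    plan_first_step: Callable[[str], str],              # Planner
    run_code: Callable[[List[str]], str],               # Coder
    verify: Callable[[List[str], str], bool],           # Verifier
    route: Callable[[List[str], str], str],             # Router: "add" or "fix"
    propose_step: Callable[[List[str], str, str], str], # Planner, given the decision
    max_rounds: int = 5,
) -> List[str]:
    """Plan, implement, verify, then add or fix a step until accepted."""
    context = analyze_files()
    plan = [plan_first_step(context)]
    for _ in range(max_rounds):
        result = run_code(plan)
        if verify(plan, result):
            return plan                      # the Verifier gates each cycle
        decision = route(plan, result)
        new_step = propose_step(plan, result, decision)
        if decision == "fix":
            plan[-1] = new_step              # simplification: fix the last step only
        else:
            plan.append(new_step)
    return plan                              # round budget exhausted


# Stub roles so the control flow can be exercised without an LLM.
final_plan = refinement_loop(
    analyze_files=lambda: "sales.csv: columns region, amount",
    plan_first_step=lambda ctx: "load first row only",
    run_code=lambda plan: f"ran {len(plan)} step(s)",
    verify=lambda plan, result: "sum amount by region" in plan,
    route=lambda plan, result: "fix" if "first row only" in plan[-1] else "add",
    propose_step=lambda plan, result, decision: (
        "load all rows" if decision == "fix" else "sum amount by region"
    ),
)
```

With these stubs the first round is rejected and fixed, the second is rejected and extended, and the third is accepted, so both Router outcomes are exercised once.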

Ablations quantify the decomposition's value: removing the Data File Analyzer collapses DABStep hard-task accuracy 45.2 % → 26.98 %; removing the Router (forcing extend-only) degrades both easy and hard tasks — "it is more effective to correct mistakes in a plan than to keep adding potentially flawed steps" (Source: sources/2025-11-06-google-ds-star-versatile-data-science-agent).

DS-STAR adds a third framing to this pattern's taxonomy alongside the Storex (domain-based) and Dash (sub-tool) framings:

  • Role-in-the-refinement-loop decomposition (DS-STAR). One agent per loop role (context, plan, implement, judge, route); coordination is the loop itself. Intent: make verification and revision first-class, isolate ablation-testable primitives.

Offline-context-generation framing (Meta, 2026-04-06)

Meta's AI Pre-Compute Engine is the canonical wiki instance of the pattern applied offline — a one-session orchestration of 50+ specialised agents that reads a 4-repo / 4,100-file config-as-code data pipeline and emits 59 context files plus a dependency graph. Nine named roles:

  • Explorers (2) — map the codebase.
  • Module analysts (11) — apply the five-questions framework per module.
  • Writers (2) — synthesise the 59 compass-shaped context files.
  • Critics (10+ across 3 rounds) — independent quality review.
  • Fixers (4) — apply corrections.
  • Upgraders (8) — refine the orchestration / routing layer.
  • Prompt testers (3) — validate 55+ queries × 5 personas.
  • Gap-fillers (4) — cover remaining directories.
  • Final critics (3) — integration tests.

This is the fourth framing: the agents specialise by pipeline stage in an offline knowledge-extraction flow. The Storex / Dash / DS-STAR framings are runtime decompositions; Meta's is a one-shot offline orchestration whose output is a durable artifact (the 59 files) that downstream runtime agents consume. Delivers measurable outcomes: critic quality 3.65 → 4.20 / 5.0 across 3 rounds, zero hallucinated file paths, ~40 % fewer tool calls per task on a six-task preliminary eval (Source: sources/2026-04-06-meta-how-meta-used-ai-to-map-tribal-knowledge-in-large-scale-data-pipelines).

Skill-over-shared-tools framing (Meta Capacity Efficiency, 2026-04-16)

Meta's Capacity Efficiency Platform is the canonical wiki instance of the pattern applied as skill-based composition over a shared MCP tool layer. The same five tools (profiling · experiments · config history · code search · documentation) serve every specialist agent; the agents differ only in their skill bundle (named encoded domain expertise).

At least seven specialist agents share the platform:

Meta's explicit claim: "each new capability requires few to no new data integrations since they can just compose existing tools with new skills." This adds a fifth framing to this pattern's taxonomy: the agents specialise by skill bundle, not by subject-matter domain (Storex) / sub-tool complexity (Dash) / refinement-loop role (DS-STAR) / offline-pipeline stage (Pre-Compute Engine). Differentiator: the tool layer is shared across specialists, and new agents cost only new skill-authoring, not new data integration (Source: sources/2026-04-16-meta-capacity-efficiency-at-meta-how-unified-ai-agents-optimize-performance-at-hyperscale).

See patterns/mcp-tools-plus-skills-unified-platform for the full architectural pattern.
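
As a rough sketch of skill-based composition over a shared tool layer: the five tool stubs below are named after the tool categories listed above, while the `Skill` and `SkillAgent` types, the two example skills, and the fixed tool ordering are assumptions for illustration and are not taken from Meta's platform.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]

# One shared tool layer; stubs named after the five categories in the text.
SHARED_TOOLS: Dict[str, Tool] = {
    "profiling": lambda target: f"profiling({target})",
    "experiments": lambda target: f"experiments({target})",
    "config_history": lambda target: f"config_history({target})",
    "code_search": lambda target: f"code_search({target})",
    "documentation": lambda target: f"documentation({target})",
}


@dataclass(frozen=True)
class Skill:
    """Encoded domain expertise: instructions plus which tools to use (hypothetical)."""
    name: str
    instructions: str
    tool_order: Tuple[str, ...]


class SkillAgent:
    """Agents differ only in their skill; the tool layer is the same object."""

    def __init__(self, skill: Skill, tools: Dict[str, Tool] = SHARED_TOOLS):
        unknown = [t for t in skill.tool_order if t not in tools]
        if unknown:
            raise KeyError(f"skill {skill.name!r} references unknown tools: {unknown}")
        self.skill = skill
        self.tools = tools

    def run(self, target: str) -> List[str]:
        return [self.tools[name](target) for name in self.skill.tool_order]


# Two hypothetical skills composed from the same tools.
regression_triage = Skill(
    "regression-triage",
    "Find what changed shortly before the regression appeared.",
    ("config_history", "code_search", "profiling"),
)
optimization_hunt = Skill(
    "optimization-hunt",
    "Look for hot paths worth an experiment.",
    ("profiling", "experiments", "documentation"),
)

triage_agent = SkillAgent(regression_triage)
optimizer_agent = SkillAgent(optimization_hunt)
```

Under these assumptions a new specialist is one more `Skill` value; no new entry in `SHARED_TOOLS` is needed unless the skill requires data that no existing tool exposes.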

Regulated-financial-services framing (IBM + AWS KYC, 2026-04-23)

The IBM + AWS KYC architecture is the canonical wiki instance of specialised-agent decomposition applied to regulated compliance workflows on Bedrock AgentCore. One KYC Orchestration Supervisor + five domain sub-agents:

  • Identity Verification — watchlist / sanctions APIs, name-variation NLP.
  • Document Analysis — OCR, multi-language, watermark / security-feature forgery detection.
  • Fraud Detection — behavioural analysis, same-IP / same-device collision detection, semantic-similarity search over historical fraud, dynamic risk scoring with explainable assessments.
  • Compliance & Risk — jurisdiction-specific regulatory interpretation (BSA, USA PATRIOT Act, AMLD, MAS, FATF), attestation generation with audit trails.
  • Customer Experience — real-time friction-point detection, abandonment-reduction recommendations.

Supervisor does no compliance work itself; it dynamically constructs a parallel-or-sequential execution plan per case based on document types, geography, risk indicators, and historical patterns. This is a sixth framing in this pattern's taxonomy: regulated-compliance decomposition — agents specialise by regulatory sub-domain, each with a narrowly-scoped OpenAPI tool surface enforced at runtime by systems/agentcore-identity + systems/agentcore-gateway. The decomposition is the key to the sub-5-minute latency target (patterns/parallel-subagent-execution-for-latency) and to the composed confidence-tiered routing (>95 / 75-95 / <75) that the Supervisor applies to sub-agent outputs.
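
The parallel fan-out and the confidence tiers can be sketched as follows. Only the three bands (>95 / 75-95 / <75) come from the text above; the tier labels, the treatment of the boundary values, the rule that the weakest sub-agent score decides the case, and the stub scores are assumptions made for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple

# A sub-agent takes a case and returns a confidence score on a 0-100 scale (assumed).
SubAgent = Callable[[dict], float]


def run_parallel(case: dict, agents: Dict[str, SubAgent]) -> Dict[str, float]:
    """Run all sub-agents concurrently and collect their scores by name."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = {name: pool.submit(agent, case) for name, agent in agents.items()}
        return {name: future.result() for name, future in futures.items()}


def tier(confidence: float) -> str:
    """Map a score onto the three bands; boundary handling is an assumption."""
    if confidence > 95:
        return "high"
    if confidence >= 75:
        return "medium"
    return "low"


def route_case(case: dict, agents: Dict[str, SubAgent]) -> Tuple[str, Dict[str, float]]:
    scores = run_parallel(case, agents)
    overall = min(scores.values())  # assumption: the weakest check decides
    return tier(overall), scores


# Stub sub-agents with fixed scores, for illustration only.
stub_agents: Dict[str, SubAgent] = {
    "identity_verification": lambda case: 98.0,
    "document_analysis": lambda case: 97.0,
    "fraud_detection": lambda case: 80.0,
}
label, scores = route_case({"case_id": "demo"}, stub_agents)
```

Because the sub-agents are independent in this sketch, wall-clock time is bounded by the slowest one instead of the sum, which is the latency argument the pattern relies on; a sequential plan would simply call them in order.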

Full pattern: patterns/supervisor-subagent-kyc-orchestration. (Source: sources/2026-04-23-aws-modernizing-kyc-with-aws-serverless-solutions-and-agentic-ai.)
