PATTERN Cited by 1 source
Parallel sub-agent execution for latency¶
Pattern¶
When a multi-agent orchestration breaks a task into N sub-agent
calls that are data-independent, invoke them in parallel
rather than sequentially. Case-level latency becomes
max(sub-agent-latency) instead of sum(sub-agent-latency), which is
usually the largest single latency win available in multi-sub-agent
architectures.
Canonical framing, IBM + AWS KYC architecture (2026-04-23): "The parallel agent execution model is designed to reduce KYC validation time from the typical 3-5 days to near-real time for standard cases. This approach enables exponentially faster processing through simultaneous operation of Document Analysis, Identity Verification, and Fraud Detection agents rather than sequential workflows." (Source: sources/2026-04-23-aws-modernizing-kyc-with-aws-serverless-solutions-and-agentic-ai.)
Sequential vs parallel in concrete shape¶
SEQUENTIAL (wall-clock time = A + B + C):
[ DocAnalysis ────][ IdentityVerify ──────][ FraudDetect ────────]
PARALLEL (wall-clock time = max(A, B, C)):
[ DocAnalysis ────]
[ IdentityVerify ──────]
[ FraudDetect ────────]
▼
Supervisor composes
With the three KYC sub-agents averaging, say, 2s, 3s, and 4s respectively:
- Sequential: 9s
- Parallel: 4s
The advantage grows with sub-agent count: five parallel sub-agents at ~5s each take ~5s of wall-clock time instead of 25s.
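The sum-versus-max arithmetic can be checked with a minimal `asyncio` sketch. Everything here is illustrative: the sub-agent names, the `run_sub_agent` helper, and the latencies (the 2s/3s/4s example scaled down 20x) are assumptions, and `asyncio.sleep` stands in for real model and tool calls.

```python
import asyncio
import time

# Hypothetical latencies in seconds: the 2s/3s/4s example scaled down 20x.
LATENCIES = {"doc_analysis": 0.10, "identity_verify": 0.15, "fraud_detect": 0.20}


async def run_sub_agent(name: str, latency: float) -> str:
    """Stand-in for a real sub-agent call; only simulates its latency."""
    await asyncio.sleep(latency)
    return name


async def run_sequential() -> float:
    """Await each sub-agent in turn; wall-clock time is roughly the sum."""
    start = time.perf_counter()
    for name, latency in LATENCIES.items():
        await run_sub_agent(name, latency)
    return time.perf_counter() - start


async def run_parallel() -> float:
    """Start all sub-agents at once; wall-clock time is roughly the max."""
    start = time.perf_counter()
    await asyncio.gather(
        *(run_sub_agent(name, latency) for name, latency in LATENCIES.items())
    )
    return time.perf_counter() - start


sequential_s = asyncio.run(run_sequential())
parallel_s = asyncio.run(run_parallel())
```

On an idle machine `sequential_s` should land near 0.45 and `parallel_s` near 0.20, mirroring the 9s and 4s figures.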
Preconditions¶
Data independence is the hard requirement:
- Sub-agents must not need each other's outputs as input.
- Tool calls must not depend on ordering (e.g. don't activate the account before identity is verified).
- Each sub-agent's confidence score must be independently meaningful — the Supervisor composes them at the end.
Where preconditions fail:
- Document Analysis → Identity Verification is not parallelisable if Identity needs the extracted passport number from Document Analysis.
- Compliance & Risk may depend on all three earlier sub-agents completing, so it sits sequentially after the parallel fan-out.
- Customer Experience is observational and parallel-safe.
The KYC architecture picks Doc Analysis + Identity Verify + Fraud Detection as the parallelisable set specifically because they share inputs (customer profile + documents) but not outputs — each produces a confidence signal the Supervisor can combine.
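One way to make the data-independence check explicit is to declare each sub-agent's dependencies and derive the parallel stages from them. The sketch below uses the standard-library `graphlib.TopologicalSorter`; the dependency maps and the `parallel_stages` helper are hypothetical and are not taken from the source architecture.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: sub-agent -> sub-agents whose output it needs.
INDEPENDENT = {
    "doc_analysis": set(),
    "identity_verify": set(),
    "fraud_detect": set(),
    "compliance_risk": {"doc_analysis", "identity_verify", "fraud_detect"},
}

# Variant where Identity Verification needs the extracted passport number.
CHAINED = {**INDEPENDENT, "identity_verify": {"doc_analysis"}}


def parallel_stages(depends_on: dict) -> list:
    """Group sub-agents into stages; members of one stage can run in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if dependencies are circular
    stages = []
    while sorter.is_active():
        ready = sorter.get_ready()  # every node whose dependencies are done
        stages.append(sorted(ready))
        sorter.done(*ready)
    return stages


independent_stages = parallel_stages(INDEPENDENT)
chained_stages = parallel_stages(CHAINED)
```

With the independent map the first stage holds all three sub-agents and Compliance & Risk follows. With the chained map Identity Verification drops into a later stage, so the fan-out narrows to two.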
Supervisor as the parallelisation planner¶
The Supervisor owns the parallelisation decision:
"The supervisor analyses case characteristics (document types, customer geography, risk indicators, and historical patterns) to construct context-aware execution plans that invoke sub-agents in parallel or sequentially based on dependencies."
Not every case parallelises the same way. A high-risk case might run Fraud Detection first sequentially and short-circuit on detection, then only spawn other sub-agents if Fraud Detection clears. A low-risk case runs everything in parallel. The Supervisor's dynamic planning (patterns/supervisor-subagent-kyc-orchestration) is what makes this choice per-case.
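A per-case plan of this kind can be sketched as a list of stages, where each stage runs in parallel and the Supervisor checks for a short-circuit between stages. The `Case` fields, the `FRAUD_BLOCK_THRESHOLD` value, and the stubbed `run_sub_agent` are assumptions made for illustration. The source does not describe how the real Supervisor represents its plans.

```python
import asyncio
from dataclasses import dataclass

FRAUD_BLOCK_THRESHOLD = 0.9  # hypothetical cut-off for short-circuiting


@dataclass
class Case:
    case_id: str
    high_risk: bool
    fraud_score: float  # stand-in for what Fraud Detection would compute


async def run_sub_agent(name: str, case: Case) -> tuple[str, float]:
    """Stubbed sub-agent: returns (name, score) without doing real work."""
    await asyncio.sleep(0)
    score = case.fraud_score if name == "fraud_detect" else 1.0
    return name, score


def build_plan(case: Case) -> list[list[str]]:
    """High-risk cases gate on Fraud Detection; low-risk cases fan out fully."""
    if case.high_risk:
        return [["fraud_detect"], ["doc_analysis", "identity_verify"]]
    return [["doc_analysis", "identity_verify", "fraud_detect"]]


async def run_case(case: Case) -> dict[str, float]:
    results: dict[str, float] = {}
    for stage in build_plan(case):
        # Sub-agents within one stage run concurrently.
        stage_results = await asyncio.gather(
            *(run_sub_agent(name, case) for name in stage)
        )
        results.update(stage_results)
        # Short-circuit: skip later stages once fraud is flagged.
        if results.get("fraud_detect", 0.0) >= FRAUD_BLOCK_THRESHOLD:
            break
    return results


blocked = asyncio.run(run_case(Case("c1", high_risk=True, fraud_score=0.95)))
cleared = asyncio.run(run_case(Case("c2", high_risk=True, fraud_score=0.10)))
low_risk = asyncio.run(run_case(Case("c3", high_risk=False, fraud_score=0.10)))
```

In this sketch the blocked high-risk case never spawns Document Analysis or Identity Verification. That saved work is what the sequential gate buys in exchange for extra latency on cases that clear.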
Latency composition hazards¶
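Tail-latency amplification. max(A, B, C) is sensitive to the
tail of each sub-agent: one Fraud Detection call at its p99 sets
the latency of the whole case. With three independent sub-agents,
the chance that at least one exceeds its own p99 is
1 - 0.99^3 ≈ 3%, so the case's tail is worse than any single
sub-agent's. Mitigations: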
- Per-sub-agent timeout budgets (if a sub-agent exceeds its budget, the Supervisor proceeds with partial confidence and a lower confidence tier).
- Hedged tool calls inside individual sub-agents (redundant invoke after short delay). See concepts/tail-latency-at-scale.
- Sub-agent-level caching where idempotent (e.g. pre-cached watchlist results for the same identity).
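The timeout-budget mitigation can be sketched with `asyncio.wait_for`. The budgets, the fixed 0.9 confidence, and the "full"/"partial" tier names are hypothetical. A real Supervisor would define its own tiers and decide how a missing signal affects the outcome.

```python
import asyncio

# Hypothetical per-sub-agent budgets in seconds.
BUDGETS = {"doc_analysis": 0.2, "identity_verify": 0.2, "fraud_detect": 0.05}


async def sub_agent(latency: float) -> float:
    """Stand-in sub-agent: sleeps, then returns a fixed confidence score."""
    await asyncio.sleep(latency)
    return 0.9


async def with_budget(name: str, latency: float):
    """Run one sub-agent under its budget; None marks a timed-out result."""
    try:
        score = await asyncio.wait_for(sub_agent(latency), timeout=BUDGETS[name])
        return name, score
    except asyncio.TimeoutError:
        return name, None


async def run_case(latencies: dict):
    pairs = await asyncio.gather(
        *(with_budget(name, latency) for name, latency in latencies.items())
    )
    results = dict(pairs)
    # Any missing signal downgrades the case to a lower confidence tier.
    tier = "full" if all(v is not None for v in results.values()) else "partial"
    return results, tier


# Fraud Detection is simulated as slow (0.5s) against a 0.05s budget.
results, tier = asyncio.run(
    run_case({"doc_analysis": 0.01, "identity_verify": 0.01, "fraud_detect": 0.5})
)
```

The slow sub-agent is cancelled at its budget, so the case finishes in roughly 0.05s rather than 0.5s and is marked `"partial"`.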
Resource contention. Parallel sub-agents sharing a common rate-limited downstream (same vendor API, same model endpoint) serialise behind the rate limit. Either fan out to distinct endpoints or let the rate limiter + Supervisor downgrade gracefully.
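The serialisation effect can be shown by measuring peak concurrency against a shared limit. In this sketch an `asyncio.Semaphore` stands in for a downstream rate limit. That is a simplification, since a semaphore caps concurrency, not requests per second. The `call_vendor` and `peak_concurrency` names are hypothetical.

```python
import asyncio


async def call_vendor(limit: asyncio.Semaphore, stats: dict) -> None:
    """Simulated downstream call that records how many calls overlap."""
    async with limit:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)  # stand-in for the vendor round trip
        stats["in_flight"] -= 1


async def peak_concurrency(max_concurrent: int, callers: int = 3) -> int:
    """Fan out `callers` sub-agents against one shared limit; return the peak."""
    limit = asyncio.Semaphore(max_concurrent)
    stats = {"in_flight": 0, "peak": 0}
    await asyncio.gather(*(call_vendor(limit, stats) for _ in range(callers)))
    return stats["peak"]


peak_shared = asyncio.run(peak_concurrency(1))    # one shared, limited endpoint
peak_distinct = asyncio.run(peak_concurrency(3))  # enough capacity for all three
```

With a limit of 1 the three "parallel" sub-agents never overlap, so wall-clock time drifts back toward the sequential sum.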
Composition rule dominance. If the Supervisor uses min-confidence across sub-agents, a single low-confidence sub-agent dominates. If it uses a weighted average, it is more forgiving. The composition rule has to be aligned with regulatory risk appetite and documented, not just coded.
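The difference between the two composition rules is easy to see on a small example. The scores and equal weights below are invented for illustration and do not come from the source.

```python
def min_confidence(scores: dict) -> float:
    """Strict rule: the weakest sub-agent signal decides the case."""
    return min(scores.values())


def weighted_average(scores: dict, weights: dict) -> float:
    """Forgiving rule: a weak signal is diluted by the stronger ones."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical confidence scores: one weak signal among two strong ones.
scores = {"doc_analysis": 0.95, "identity_verify": 0.90, "fraud_detect": 0.40}
weights = {"doc_analysis": 1.0, "identity_verify": 1.0, "fraud_detect": 1.0}

strict = min_confidence(scores)
forgiving = weighted_average(scores, weights)
```

Here the strict rule yields 0.40 and the equal-weight average yields 0.75, so the same case could fall on opposite sides of an approval threshold depending on which rule is in force.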
When not to parallelise¶
- Sub-agents have genuine data dependencies: parallelising creates wrong-order reads.
- Regulatory sequence requirements: some jurisdictions require sequential gating (e.g. sanctions screening must complete before any other KYC step). Parallel → non-compliant.
- Side-effect risk: sub-agents that write to the same state should not run concurrently without explicit conflict handling.
- Cost vs latency: parallel fan-out always runs all N sub-agents and needs N× peak concurrency, whereas a sequential plan can short-circuit early. The latency win may be small when sub-agents are already fast.
Related patterns¶
- patterns/two-loop-parallel-async-build — Yelp CHAOS's client-side version of the same idea: loop 1 starts all async calls, loop 2 awaits them; sync-looking API.
- patterns/multi-agent-streaming-coordination — pairs well because Kafka fan-out naturally supports parallel subscription-based dispatch.
- patterns/supervisor-subagent-kyc-orchestration — this pattern is the latency mechanism inside that orchestration.
Seen in¶
- sources/2026-04-23-aws-modernizing-kyc-with-aws-serverless-solutions-and-agentic-ai — three parallel sub-agents (Doc Analysis + Identity Verify + Fraud Detect) as the core mechanism behind 3-5-day → sub-5-minute KYC latency target.