
Multi-agent coordination over streaming

Pattern

In multi-agent systems, agents need to communicate, hand off work, and synchronise. The streaming-broker pattern is to treat multi-agent coordination as a microservices-over-Kafka problem: each agent is a producer and/or consumer of events on streaming topics; coordination is via topic publication + subscription, not point-to-point RPC or shared state.

agent A ─publish─► topic(plan-ready) ◄─subscribe─ agent B
                                     ◄─subscribe─ agent C

agent B ─publish─► topic(result-b)   ◄─subscribe─ agent D (aggregator)
agent C ─publish─► topic(result-c)   ◄─subscribe─ agent D

This gives multi-agent systems the same benefits microservices got from Kafka a decade ago: decoupled services, durability, fan-in, and fan-out.
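The diagram above can be sketched in a few lines. This is a toy in-memory stand-in for the broker (not a real Kafka client); the topic name and agent callbacks are illustrative. The point it demonstrates is the decoupling: the publisher never names its subscribers.

```python
from collections import defaultdict

class Broker:
    """Toy in-memory stand-in for a streaming broker: each topic is an
    append-only list of events, and subscribers are callbacks fired on
    publish. Real brokers add partitions, offsets, and consumer groups."""
    def __init__(self):
        self.topics = defaultdict(list)       # topic name -> retained events
        self.subscribers = defaultdict(list)  # topic name -> callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, event):
        self.topics[topic].append(event)      # durable record of the event
        for callback in self.subscribers[topic]:
            callback(event)

broker = Broker()
received = []

# Agents B and C subscribe to "plan-ready"; agent A never learns who they are.
broker.subscribe("plan-ready", lambda e: received.append(("agent-b", e)))
broker.subscribe("plan-ready", lambda e: received.append(("agent-c", e)))

# Agent A publishes once; both subscribers are triggered (fan-out).
broker.publish("plan-ready", {"task": "t1", "plan": "split work"})
```

Adding agent D is one more `subscribe` call; agent A's code is untouched.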

Canonical statement

From Tyler Akidau's 2026-02-10 Redpanda post (Source: sources/2026-02-10-redpanda-how-to-safely-deploy-agentic-ai-in-the-enterprise):

"Multi-agent coordination seems like another classic streaming use case. If you think about the microservices architecture, you get benefits like decoupled services, durability, and fan-in and fan-out inputs. Multi-agent scenarios also require scalable, decoupled communication. With streaming, you get easier maintenance and better durability for your multi-agent system."

Four properties the streaming broker delivers

  1. Decoupled services. Agent A publishes to a topic; it doesn't know (and doesn't need to know) which agents subscribe. Agent B subscribes to the topic; it doesn't know who the producers are. Adding a new agent is a new subscriber, not a producer-side rewrite.
  2. Durability. If agent B is offline when A publishes, the event is retained on the topic. B catches up when it restarts. This is load-bearing for long-running multi-agent workflows where individual agent sessions are shorter than the workflow.
  3. Fan-out. A single producer can feed N consumers. One "plan-ready" event from a coordinator agent can trigger N worker agents in parallel.
  4. Fan-in. N producers can feed one aggregator. M worker agents can each emit their partial result; one aggregator consumer subscribes to the common topic and assembles the final answer.
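Properties 2 and 4 hinge on the topic being an append-only log that consumers pull from at their own offset. A minimal sketch (the `Topic` class and offset handling are illustrative, not a real consumer API):

```python
class Topic:
    """Toy append-only log. Consumers track their own read offset, so a
    consumer that was offline simply resumes from where it left off."""
    def __init__(self):
        self.log = []

    def publish(self, event):
        self.log.append(event)

    def read_from(self, offset):
        return self.log[offset:]

# Durability: agent B is offline while the coordinator publishes two events.
plan_ready = Topic()
plan_ready.publish({"task": "t1"})
plan_ready.publish({"task": "t2"})

# B restarts and catches up from its last committed offset (0 here),
# then commits the new offset.
b_offset = 0
caught_up = plan_ready.read_from(b_offset)
b_offset += len(caught_up)

# Fan-in: workers B and C emit partial results to a common topic;
# aggregator D reads the whole topic and assembles the final answer.
results = Topic()
results.publish({"worker": "B", "part": 1})
results.publish({"worker": "C", "part": 2})
total = sum(e["part"] for e in results.read_from(0))
```

In Kafka terms, `b_offset` plays the role of a committed consumer-group offset: the workflow outlives any single agent session because the log, not the agent, holds the state.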

Why decoupled > RPC for multi-agent

Synchronous RPC between agents is the naïve shape: agent A calls agent B directly; A blocks until B responds. Problems:

  • Tight coupling of availability. A can't make progress if B is down, even if A's work is otherwise ready to proceed.
  • No durability on crash. If A crashes mid-call to B, the work is lost; no shared substrate to resume from.
  • Scatter-gather complexity. Coordinating M-way fan-out + fan-in requires A to track every outstanding call; any partial failure needs bespoke retry logic.
  • Back-pressure awkwardness. A has to implement rate-limiting against B; every new A needs its own rate-limiter.
  • Observability scattered. Each A→B RPC is its own trace; cross-agent causality lives in a separate distributed-tracing system rather than in one place.

Topic-mediated coordination retires all five problems. The broker is the durability boundary; consumer-group semantics handle fan-out back-pressure; the topic log is the unified observability surface.

Composition with the audit envelope

The multi-agent coordination topics naturally compose with the audit envelope:

  • Every agent's published event is captured for audit by construction — it's already on a durable, queryable log.
  • Cross-agent workflows can be replayed end-to-end against the topic sequence.
  • Lineage is explicit: each event's headers / schema carries the producing agent ID, the task ID, the upstream trigger.

This is the composition advantage of one substrate, many views — the same topic that carries coordination traffic also carries the audit trail.
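A sketch of that composition: the audit view is just a filtered replay of the same coordination log. The header field names (`agent_id`, `task_id`, `caused_by`) are illustrative, not a fixed schema.

```python
# Coordination events already on the topic, each carrying lineage headers
# (field names are illustrative assumptions, not a standard schema).
coordination_log = [
    {"agent_id": "coordinator", "task_id": "t1", "caused_by": None,
     "type": "plan-ready"},
    {"agent_id": "worker-b", "task_id": "t1", "caused_by": "plan-ready",
     "type": "result-b"},
    {"agent_id": "worker-c", "task_id": "t1", "caused_by": "plan-ready",
     "type": "result-c"},
]

def audit_trail(log, task_id):
    """Audit as a view: replay the coordination topic filtered by task.
    No separate audit pipeline exists; one substrate serves both reads."""
    return [(e["agent_id"], e["type"]) for e in log if e["task_id"] == task_id]

trail = audit_trail(coordination_log, "t1")
```

The same replay machinery gives end-to-end workflow reconstruction: order within the log is the order of the workflow.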

Structural sibling: microservices over Kafka

The pattern is structurally identical to the microservices-over-Kafka shape that the streaming community already canonicalised:

| Microservices | Multi-agent |
| --- | --- |
| Service A | Agent A |
| Service B | Agent B |
| Event-driven communication | Event-driven communication |
| Kafka topic | Kafka topic |
| Consumer groups | Consumer groups (each agent type as a group) |
| Schema registry | Schema registry (for agent event shapes) |
| Back-pressure via consumer lag | Back-pressure via consumer lag |

The implication: multi-agent systems don't need a new coordination substrate. The streaming-broker primitives that power thousands of microservices deployments today also power multi-agent deployments, at the cost of a framing shift from "services" to "agents".

Trade-offs

Wins

  • Proven substrate (Kafka, Redpanda, etc.) with known operational shape.
  • Durable, queryable coordination record (audit + replay for free).
  • Fan-out + fan-in primitives native.
  • Agents can crash and restart without losing workflow state.
  • Decoupled scaling: coordinator agents, worker agents, and aggregator agents all scale independently.

Costs

  • Async-first mental model. Teams used to synchronous RPC have to rewire to event-driven thinking; error handling is different (retry + dead-letter vs. exception).
  • Event schema management. Agents need a shared schema for coordination events; schema drift across agents is the new cross-service API break.
  • Ordering semantics. Multi-agent workflows with strict ordering requirements (e.g. approval chains) need per-key partitioning + single-consumer-per-partition discipline to preserve order.
  • Exactly-once across agents is hard. If an agent is non-idempotent (e.g. makes an external API call with side effects), the broker's exactly-once delivery semantics don't help; each agent needs its own idempotency key / deduplication.
  • Latency floor. Async coordination adds broker-hop latency (milliseconds). For use cases requiring sub-millisecond coordination (rare at the agent altitude), direct RPC may still be preferable.
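The exactly-once cost above is usually handled on the consumer side. A minimal sketch of per-agent deduplication, assuming the broker may redeliver events after a crash (the key format and field names are illustrative):

```python
class DedupingAgent:
    """Consumer-side deduplication: the broker guarantees at-least-once
    delivery, so the agent tracks seen idempotency keys and applies each
    side-effecting action at most once."""
    def __init__(self):
        self.seen = set()        # in production: a persistent store
        self.side_effects = []   # stands in for external API calls

    def handle(self, event):
        key = event["idempotency_key"]
        if key in self.seen:
            return               # duplicate delivery: skip the side effect
        self.seen.add(key)
        self.side_effects.append(event["payload"])

agent = DedupingAgent()
event = {"idempotency_key": "t1-step3", "payload": "call external API"}
agent.handle(event)
agent.handle(event)  # redelivered after a restart; deduplicated
```

The `seen` set would need to be persisted (and ideally committed atomically with the consumer offset) for the guarantee to survive agent restarts.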

Seen in

  • sources/2026-02-10-redpanda-how-to-safely-deploy-agentic-ai-in-the-enterprise — canonical wiki introduction. Akidau names multi-agent coordination as axis 8 of his eight-axis enterprise-agent-infrastructure checklist, framing it as "another classic streaming use case" that inherits the decoupled-services + durability + fan-in + fan-out benefits from the microservices-over-Kafka lineage.