PATTERN · Cited by 1 source
Swarm of discovery agents for context prebuild¶
Intent¶
Run many parallel AI agents against a customer's data-surface-area (telemetry, codebase, document store) as a background task, producing a searchable corpus of structured context ahead of any user question — so that interactive agent queries are served by cheap retrieval over pre-computed memory instead of expensive on-demand exploration.
The "swarm" framing emphasises:
- Parallelism across discovery subtasks — each agent walks a distinct slice of the surface area, and the slices compose into a full corpus.
- Role specialisation — different pipeline stages need different agent capabilities (data-source enumeration, metric extraction, dependency inference, summarisation).
- Background / amortised cost — the swarm runs on its own schedule, not in response to user queries, so inference cost is amortised across many user sessions.
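The split the framing describes — parallel background build, cheap foreground lookup — can be sketched minimally as follows (all names are hypothetical; real discovery agents would call LLMs and data-source APIs rather than return stubs):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical surface area: each slice is walked by one discovery agent.
SURFACE_SLICES = ["prometheus-us", "prometheus-eu", "loki-main", "tempo-main"]

def discovery_agent(slice_name: str) -> dict:
    """One agent walks one slice and emits structured context (stubbed)."""
    return {"slice": slice_name, "summary": f"context for {slice_name}"}

def build_corpus() -> dict:
    """Background swarm: parallel across slices, composed into one corpus."""
    with ThreadPoolExecutor() as pool:
        chunks = pool.map(discovery_agent, SURFACE_SLICES)
        return {c["slice"]: c for c in chunks}

def answer_query(corpus: dict, key: str) -> str:
    """Interactive path: a single cheap lookup, no on-demand exploration."""
    return corpus[key]["summary"]

corpus = build_corpus()  # runs on its own schedule, not per user question
```

The key structural point is that `build_corpus` is the only place parallelism and LLM cost live; `answer_query` is a plain retrieval call.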
Canonical instance (Grafana Assistant, 2026-05-01)¶
(Source: sources/2026-05-01-grafana-how-grafana-assistant-learns-your-infrastructure-before-you-even-ask)
Grafana's four-stage swarm pipeline for infrastructure memory:
- Data source discovery — identify all connected Prometheus, Loki, and Tempo data sources in the stack.
- Metrics scans — "Agents query your Prometheus data sources in parallel to find services, deployments, and infrastructure components."
- Enrichments via logs and traces — correlate Loki / Tempo data with corresponding Prometheus metrics, adding context about log formats, trace structures, service dependencies.
- Structured knowledge generation — per discovered service group, produce a five-category schema document.
Verbatim framing: "A swarm of AI agents does the heavy lifting."
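The four stages compose linearly. A hedged sketch of that composition (function names and data shapes are illustrative, not Grafana's API; real stages would fan out agents in parallel where this stub is serial):

```python
def discover_data_sources(stack):
    # Stage 1: enumerate connected Prometheus, Loki, and Tempo data sources.
    return [s for s in stack if s["type"] in ("prometheus", "loki", "tempo")]

def scan_metrics(sources):
    # Stage 2: query each Prometheus source for services and components.
    return [{"service": f"svc-{i}", "source": s["name"]}
            for s in sources if s["type"] == "prometheus"
            for i in range(s["services"])]

def enrich(services, sources):
    # Stage 3: correlate Loki / Tempo data with each service's metrics.
    types = {s["type"] for s in sources}
    for svc in services:
        svc["log_format"] = "json" if "loki" in types else None
        svc["traced"] = "tempo" in types
    return services

def generate_knowledge(services):
    # Stage 4: one structured document per discovered service group.
    return {svc["service"]: svc for svc in services}

stack = [{"type": "prometheus", "name": "prom-a", "services": 2},
         {"type": "loki", "name": "loki-a"},
         {"type": "tempo", "name": "tempo-a"}]
sources = discover_data_sources(stack)
corpus = generate_knowledge(enrich(scan_metrics(sources), sources))
```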
When to reach for this pattern¶
- Large surface area that is expensive to walk on-demand: a stack with hundreds of services, a codebase with thousands of modules, a knowledge base with many documents. Per-query exploration doesn't scale.
- User questions are predictably shaped against a precomputable answer schema. "What does X depend on?" → look up X's Dependencies chunk. Not a fit for open-ended, non-schematic queries.
- Expensive LLM inference vs. cheap retrieval. Swarm runs once per refresh cycle at LLM cost; user queries are served by vector-DB lookup at commodity cost.
- Substrate permits parallel enumeration. Data-source APIs, file system walks, database listings — all agent-walkable in parallel.
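The cost argument reduces to simple break-even arithmetic. All figures below are made-up placeholders, purely to make the shape of the trade-off concrete:

```python
# Illustrative break-even arithmetic; every number here is a placeholder.
swarm_cost_per_refresh = 50.00     # LLM spend to rebuild the corpus once
retrieval_cost_per_query = 0.001   # vector-DB lookup
exploration_cost_per_query = 0.25  # on-demand LLM exploration, no corpus

# Prebuild + retrieval beats pure exploration once:
#   swarm + q * retrieval < q * exploration
#   =>  q > swarm / (exploration - retrieval)
break_even = swarm_cost_per_refresh / (
    exploration_cost_per_query - retrieval_cost_per_query)
print(round(break_even))  # queries per refresh cycle to justify the swarm
```

Below that query volume per refresh cycle — the "surface area is small" / "small organisation" cases — the swarm does not pay for itself.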
When to avoid it¶
- Surface area is small. A handful of services / modules / documents — on-demand exploration is cheaper than pre-building a corpus.
- Queries are highly novel. If users ask unanticipated questions that don't map onto a schema, precomputed memory helps less; the agent must fall back to raw exploration anyway.
- Data changes faster than refresh cadence. If state changes within minutes (financial trading, real-time control systems), a weekly memory is actively misleading.
- Inference cost is prohibitive. Swarm inference over thousands of service groups is real LLM spend; for small organisations it may be uneconomical.
Structural properties¶
Parallelism axis¶
The swarm parallelises across the surface, not across queries. Each user query hits a single vector-DB lookup; the parallelism is all in the background build.
Role decomposition¶
Grafana's swarm has four named roles (one per pipeline stage). Each role has narrower context than a general research agent:
- Discovery agents only need to enumerate data sources.
- Metric-scan agents only need to query Prometheus.
- Enrichment agents only need to correlate three data sources.
- Summarisation agents only need to synthesise the five-category schema.
Role specialisation is an instance of patterns/specialized-agent-decomposition at the background-batch altitude.
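One way to make "narrower context" operational is to scope each role to a minimal tool set. A hypothetical capability map (tool names are illustrative, not Grafana's):

```python
# Hypothetical capability scoping per role: each pipeline stage's agent
# gets only the tools its narrow slice of the pipeline needs.
ROLE_TOOLS = {
    "discovery":     {"list_data_sources"},
    "metric_scan":   {"query_prometheus"},
    "enrichment":    {"query_prometheus", "query_loki", "query_tempo"},
    "summarisation": {"read_findings", "write_schema_document"},
}

def check_tool(role: str, tool: str) -> bool:
    """Reject any tool call outside the role's narrow contract."""
    return tool in ROLE_TOOLS[role]
```

Enforcing the contract at the tool boundary keeps a misbehaving stage from wandering outside its slice of the surface area.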
Output schema discipline¶
All summarisation agents produce output against the same typed schema (patterns/five-category-service-knowledge-schema). Without schema discipline, the corpus becomes a pile of incommensurable free-form summaries that semantic search can't navigate effectively.
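A typed contract can be enforced at write time. The source names Dependencies and log-structure fields; the remaining category names in this sketch are illustrative stand-ins, not the actual five-category schema:

```python
from __future__ import annotations
from dataclasses import dataclass

# Sketch of a typed output contract for summarisation agents. Field names
# beyond dependencies / log_structure are assumptions, not the real schema.
@dataclass
class ServiceGroupMemory:
    service_group: str
    metrics: list[str]
    dependencies: list[str]
    log_structure: str | None
    trace_structure: str | None

def validate(doc: dict) -> ServiceGroupMemory:
    """Every summarisation agent writes against the same schema, so the
    corpus stays navigable instead of a pile of free-form summaries.
    Raises TypeError on any document missing a required category."""
    return ServiceGroupMemory(**doc)
```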
Background cadence¶
The swarm runs on its own schedule (weekly refresh in the Grafana instance) plus optional manual-trigger. User interactions never wait for the swarm.
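The cadence logic is simple enough to state directly (a minimal sketch; the weekly interval matches the Grafana instance, everything else is assumed):

```python
from datetime import datetime, timedelta

REFRESH_INTERVAL = timedelta(weeks=1)  # Grafana instance: weekly refresh

def needs_refresh(last_refresh: datetime, now: datetime,
                  manual_trigger: bool = False) -> bool:
    """Refresh on the weekly cycle or on explicit manual trigger. The user
    query path never calls this — it only reads the existing corpus."""
    return manual_trigger or (now - last_refresh) >= REFRESH_INTERVAL
```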
Relationship to adjacent patterns¶
| Pattern | Relationship |
|---|---|
| patterns/precomputed-agent-context-files | Parent pattern. Meta's per-module markdown files; same idea at a different substrate (code vs. telemetry) and a different output format (freeform markdown vs. typed schema). |
| patterns/parallel-subagent-execution-for-latency | Query-time counterpart. Parallel subagents reduce per-query latency; swarm-of-discovery reduces aggregate cost across many queries by moving compute offline. |
| patterns/specialized-agent-decomposition | Role shape. Each pipeline stage is a specialised agent; composition makes the swarm. |
| patterns/five-category-service-knowledge-schema | Output contract. The schema the swarm's summarisation stage produces. |
Failure modes¶
- Partial pipeline failure. If stage 3 (enrichment) fails for a service, stage 4 produces a memory with empty Dependencies + Log-structure fields. Handling strategy undisclosed for Grafana Assistant.
- Swarm-scale inference cost. Re-extracting every service group weekly is real LLM spend; cost-per-stack is undisclosed.
- Concurrent-refresh race. If a manual trigger runs alongside the automatic weekly cycle, partial memories may be visible mid-refresh. Concurrency discipline not disclosed.
- Un-instrumented coverage gap. Agents can only discover services that emit metrics — the telemetry-as-substrate precondition. Silent services are invisible.
- Discovery-agent over-access. The swarm must run with enough privilege to read all data sources; this privilege is higher than any individual user's. ACL propagation (concepts/acl-propagated-agent-memory) at retrieval time is the compensating control.
- Summarisation hallucination. Stage-4 LLM agents may hallucinate metric names or dependencies not present in the underlying telemetry. Validation discipline (check every claimed metric name exists in Prometheus before writing memory) undisclosed.
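For the last failure mode, one possible validation discipline (Grafana's actual handling is undisclosed; this is a sketch of the check the bullet describes, with assumed data shapes):

```python
# Possible guard against summarisation hallucination: reject any claimed
# metric name absent from the underlying substrate before writing memory.
def validate_memory(memory: dict, known_metrics: set[str]) -> tuple[dict, list[str]]:
    """Return the memory with only substrate-backed metric claims, plus
    the list of hallucinated names for logging / stage re-runs."""
    claimed = memory.get("metrics", [])
    hallucinated = [m for m in claimed if m not in known_metrics]
    cleaned = {**memory, "metrics": [m for m in claimed if m in known_metrics]}
    return cleaned, hallucinated
```

The same shape of check applies to claimed dependencies: any edge naming a service group the discovery stage never saw is suspect.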
Seen in¶
- sources/2026-05-01-grafana-how-grafana-assistant-learns-your-infrastructure-before-you-even-ask — canonical wiki instance of swarm-of-discovery-agents for context prebuild, at the observability-stack altitude. Four-stage pipeline (data source discovery → metrics scans → log/trace enrichment → structured knowledge generation) produces a vector-DB-stored corpus of per-service-group memories, served back via semantic search in milliseconds.
Related¶
- patterns/precomputed-agent-context-files
- patterns/five-category-service-knowledge-schema
- patterns/weekly-refresh-cadence-for-agent-context
- patterns/specialized-agent-decomposition
- patterns/parallel-subagent-execution-for-latency
- concepts/agent-infrastructure-memory
- concepts/telemetry-as-discovery-substrate
- concepts/semantic-search-over-agent-memory
- concepts/service-group
- concepts/context-engineering
- systems/grafana-assistant