Grafana Assistant¶
Grafana Assistant is Grafana Labs' AI assistant surface inside Grafana Cloud (and, from GrafanaCON 2026, self-managed Grafana Enterprise / OSS). It embeds a conversational natural-language interface directly into the Grafana UI, backed by per-customer infrastructure knowledge assembled from the stack's own telemetry. Users can ask "why did checkout-api get slower this week?", "what services does payments-backend depend on?", or "is this alert noisy, can we fix it?" against their real environment instead of against a generic documentation-trained LLM.
Canonical framing¶
The Assistant is positioned not as a separate chatbot but as an observability-integrated agent: it knows the customer's metrics / logs / traces catalogue, can write PromQL / LogQL / TraceQL against real names (not placeholder templates), and can deep-link back into the Grafana UI when a human needs to look at a specific chart or alert rule.
Infrastructure memory subsystem (2026-05-01 disclosure)¶
The 2026-05-01 launch post (sources/2026-05-01-grafana-how-grafana-assistant-learns-your-infrastructure-before-you-even-ask) is the canonical wiki home for Assistant's infrastructure memory — a background-populated, long-lived, queryable knowledge store describing the customer's runtime infrastructure, built by a "swarm of AI agents" and served back via semantic search in milliseconds.
Four-stage discovery pipeline¶
- Data source discovery. "The system identifies all connected Prometheus, Loki, and Tempo data sources in your Grafana Cloud stack."
- Metrics scans. "Agents query your Prometheus data sources in parallel to find services, deployments, and infrastructure components."
- Enrichments via logs and traces. "Loki and Tempo data sources get correlated with their corresponding metrics, adding context about log formats, trace structures, and service dependencies."
- Structured knowledge generation. "For each discovered service group, agents produce documentation covering five areas" — the five-category service-knowledge schema.
The pipeline shape is canonicalised as patterns/swarm-of-discovery-agents-for-context-prebuild — many parallel agents walking a telemetry surface to produce a searchable corpus of structured service knowledge ahead of any user question.
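Grafana discloses the four stages but none of the orchestration internals. As a rough sketch of the pattern (every function name and data shape below is a hypothetical stand-in, not Grafana's implementation), the stages might compose like this:

```python
from concurrent.futures import ThreadPoolExecutor

# All names below are hypothetical stand-ins for undisclosed internals.

def discover_data_sources(stack):
    """Stage 1: identify connected Prometheus, Loki, and Tempo data sources."""
    return [ds for ds in stack["data_sources"]
            if ds["type"] in ("prometheus", "loki", "tempo")]

def scan_metrics(ds):
    """Stage 2: one metrics-scan agent finds service groups in one Prometheus source."""
    return [{"service_group": name, "source": ds["name"]}
            for name in ds.get("services", [])]

def enrich(groups, log_trace_sources):
    """Stage 3: correlate Loki / Tempo sources with the metric-derived groups."""
    for g in groups:
        g["enrichments"] = [s["type"] for s in log_trace_sources]
    return groups

def generate_knowledge(group):
    """Stage 4: emit the five-category service-knowledge document."""
    return {"service_group": group["service_group"],
            "categories": ["identity", "key_metrics", "topology",
                           "dependencies", "log_structure"]}

def run_pipeline(stack):
    sources = discover_data_sources(stack)
    prom = [d for d in sources if d["type"] == "prometheus"]
    rest = [d for d in sources if d["type"] != "prometheus"]
    with ThreadPoolExecutor() as pool:  # "agents query ... in parallel"
        groups = [g for batch in pool.map(scan_metrics, prom) for g in batch]
    return [generate_knowledge(g) for g in enrich(groups, rest)]
```

The load-bearing property is that the fan-out happens per data source, so the corpus is prebuilt before any user question arrives.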
Five-category knowledge schema (per service group)¶
Each service group is summarised into the same five axes:
- Identity and purpose — what the service is, namespace + cluster + technology stack.
- Key metrics — actual Prometheus metric names + labels (not generic placeholders), usable directly in PromQL; including the golden signals (latency, error rate, traffic, saturation) rendered in the customer's own naming.
- Deployment topology — Kubernetes resources, replica counts, scaling configuration, container details.
- Dependencies — upstream + downstream service connections, DB / cache / queue relationships, external integrations.
- Log structure — available log labels + values, detected formats (JSON, logfmt, unstructured), common patterns, extracted field names.
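The five axes above imply a fixed per-service document shape. A minimal sketch of that shape (field names paraphrase the disclosed categories; Grafana's actual schema is not published):

```python
from dataclasses import dataclass, field

# Hypothetical shape of one service-group knowledge document.

@dataclass
class ServiceGroupKnowledge:
    # 1. Identity and purpose
    name: str
    namespace: str
    cluster: str
    technology_stack: list[str]
    # 2. Key metrics: real metric names / label sets, incl. golden signals
    key_metrics: dict[str, str] = field(default_factory=dict)    # signal -> PromQL
    # 3. Deployment topology
    topology: dict[str, int] = field(default_factory=dict)       # workload -> replicas
    # 4. Dependencies
    upstream: list[str] = field(default_factory=list)
    downstream: list[str] = field(default_factory=list)
    # 5. Log structure
    log_format: str = "unknown"                                  # json / logfmt / unstructured
    log_labels: dict[str, list[str]] = field(default_factory=dict)
```

Holding every service group to the same five axes is what makes the corpus uniformly searchable later.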
Memory storage and retrieval¶
"This knowledge is stored as searchable chunks in a vector database, so when you or the assistant need information about a specific service, it can be retrieved in milliseconds through semantic search."
The memory tier is therefore not the Grafana Cloud entity graph (which is a structured asset inventory) and not the raw Prometheus / Loki / Tempo query surface — it's an embedding-keyed document store of precomputed service summaries. Canonical retrieval primitive is concepts/semantic-search-over-agent-memory over vector similarity.
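Neither the vector database nor the embedding model is disclosed. A toy sketch of the retrieval primitive, semantic search over embedded chunks (`embed` is a hypothetical stand-in for whatever model Grafana uses):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class MemoryStore:
    """Embedding-keyed store of precomputed service-summary chunks."""

    def __init__(self, embed):
        self.embed = embed      # hypothetical embedding function: text -> vector
        self.chunks = []        # (vector, chunk) pairs

    def add(self, text, metadata):
        self.chunks.append((self.embed(text), {"text": text, **metadata}))

    def search(self, query, k=3):
        qv = self.embed(query)
        ranked = sorted(self.chunks, key=lambda c: cosine(qv, c[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]
```

A real deployment would use approximate nearest-neighbour indexing rather than the linear scan shown, which is how "milliseconds" retrieval stays plausible at scale.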
Weekly refresh (with manual trigger escape hatch)¶
"The whole process refreshes automatically on a weekly cadence, so your assistant's understanding of your infrastructure stays current as your environment evolves." Users can also trigger a manual scan. Canonical pattern: patterns/weekly-refresh-cadence-for-agent-context.
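The post discloses only the cadence and the manual trigger, not the scheduler. One way the pattern could look (names hypothetical):

```python
import time

WEEK_SECONDS = 7 * 24 * 3600

class RefreshScheduler:
    """Weekly automatic refresh with a manual-trigger escape hatch (sketch)."""

    def __init__(self, run_scan, now=time.time):
        self.run_scan = run_scan    # hypothetical: runs the discovery pipeline
        self.now = now
        self.last_run = None

    def due(self):
        return self.last_run is None or self.now() - self.last_run >= WEEK_SECONDS

    def tick(self):
        """Called periodically; scans only when the weekly cadence is due."""
        if self.due():
            self.trigger()

    def trigger(self):
        """Manual escape hatch: scan now, ahead of the next automatic cycle."""
        self.run_scan()
        self.last_run = self.now()
```

Note that a manual trigger also resets the weekly clock in this sketch; whether Grafana's implementation does the same is undisclosed.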
Access-control model: data-source-inherited¶
"Assistant also respects your organization's access controls. Each memory is linked to the data sources used to generate it, so users only see knowledge derived from data sources they have permission to access." Rather than copying the data source's ACL onto each generated memory chunk, Grafana tags each memory with its generating data source and filters by the user's data-source access list at retrieval time. Canonical concept: concepts/acl-propagated-agent-memory.
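The disclosed mechanism, tagging each memory with its generating data source and filtering at retrieval time, reduces to a very small predicate (sketch; field names hypothetical):

```python
# Each memory chunk carries the id of the data source that generated it;
# visibility is decided at retrieval time against the user's current
# data-source permissions, not by ACLs copied onto the chunks.

def visible_memories(memories, user_datasource_ids):
    """Return only memories derived from data sources the user can access."""
    allowed = set(user_datasource_ids)
    return [m for m in memories if m["source_datasource"] in allowed]
```

The design payoff of filter-at-retrieval over copy-at-write: when a data source's permissions change, visibility changes immediately, with no re-tagging of stored chunks.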
Zero-configuration commitment¶
The post repeats the zero-config framing four times: "zero configuration", "This isn't a feature you configure, enable, or maintain", "no setup steps, no configuration files, no scheduled jobs to manage", "If you have metrics, you get this infrastructure memory capability." This is the headline UX commitment — any requirement for a service catalogue, annotation, or manual registration would defeat the pitch. The bet: customer telemetry is a sufficient discovery input, canonicalised as concepts/telemetry-as-discovery-substrate.
User-facing surface¶
- Infrastructure memory browse. "You can review what the assistant has learned by navigating to the Assistant settings and browsing the discovered service groups."
- Manual refresh. Trigger a manual scan ahead of the next automatic cycle.
- Conversational query. Natural-language question → the Assistant retrieves relevant service-group memories via semantic search → grounded answer referencing the customer's real metric names / log labels / dependencies.
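The conversational path in the last bullet can be sketched end to end; `retrieve` and `llm` are hypothetical stand-ins, since neither the retrieval API nor the underlying model is disclosed:

```python
# Question -> semantic retrieval of service-group memories -> grounded prompt.

def answer(question, retrieve, llm, k=3):
    memories = retrieve(question, k=k)   # semantic search over the memory store
    context = "\n".join(m["text"] for m in memories)
    prompt = (
        "Answer using only this infrastructure knowledge "
        "(real metric names, log labels, dependencies):\n"
        f"{context}\n\nQuestion: {question}"
    )
    return llm(prompt)
```

The grounding step is why answers can name the customer's actual metrics and log labels instead of generic placeholders.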
Why it's on the wiki¶
Grafana Assistant joins a short list of agent-infrastructure-memory instances across the engineering-blog corpus:
- Meta's AI precompute engine (2026-04-06) — per-module markdown files describing Python data-pipeline configs.
- Cloudflare's "Agents that remember" (2026-03-05, summarised in companies/cloudflare) — agent context extracted from conversation threads.
- Grafana Assistant (this page) — per-service-group infrastructure summaries extracted from metrics / logs / traces.
Each instance applies the same load-bearing pattern — a background extraction pipeline that precomputes structured context for AI agents — at a different substrate (code, conversations, telemetry). The Grafana instance is the first wiki-canonical example at the observability-stack substrate.
Related systems in the Grafana stack¶
- systems/grafana-cloud — the managed platform Assistant ships inside; the 2026-05-01 disclosure targets Grafana Cloud customers specifically.
- systems/grafana — the dashboarding UI the Assistant deep-links into (and is embedded within).
- systems/prometheus + systems/loki + Tempo — the three telemetry data sources the swarm walks for discovery.
- systems/gcx-cli — gcx, the Grafana Cloud agent-ergonomic CLI; it and the Assistant are peer surfaces on the same Grafana Cloud control plane (gcx for terminal-hosted agents, the Assistant for UI-embedded chat).
Seen in¶
- sources/2026-05-01-grafana-how-grafana-assistant-learns-your-infrastructure-before-you-even-ask — canonical first-ingest of Grafana Assistant on the wiki. Introduces the infrastructure-memory subsystem, the four-stage swarm-of-agents discovery pipeline, the five-category service-group knowledge schema, the vector-DB-backed memory store, semantic-search retrieval in milliseconds, weekly automatic refresh cadence with manual trigger, data-source-inherited ACL propagation, and the zero-configuration UX commitment.
Caveats¶
- No disclosure of the underlying LLM(s), vector DB, or embedding model powering the Assistant.
- No operational numbers (total memories per stack, compute cost per weekly refresh, LLM token spend, storage footprint, query QPS, p99 retrieval latency beyond "milliseconds").
- Exact service-group detection heuristic undisclosed (presumably inferred from Prometheus label patterns or K8s workload groupings).
- Weekly-refresh partial-failure handling undisclosed.
- Whether infrastructure memory extends to self-managed Grafana Enterprise / OSS (per the 2026-04-22 Assistant-everywhere announcement) is not confirmed in this post.
Related¶
- systems/grafana-cloud
- systems/grafana
- systems/prometheus
- systems/loki
- systems/gcx-cli
- concepts/agent-infrastructure-memory
- concepts/telemetry-as-discovery-substrate
- concepts/semantic-search-over-agent-memory
- concepts/acl-propagated-agent-memory
- concepts/service-group
- concepts/context-engineering
- concepts/tribal-knowledge
- patterns/swarm-of-discovery-agents-for-context-prebuild
- patterns/five-category-service-knowledge-schema
- patterns/weekly-refresh-cadence-for-agent-context
- patterns/precomputed-agent-context-files
- companies/grafana