CONCEPT
Agent infrastructure memory¶
Definition¶
Agent infrastructure memory is a background-populated, long-lived, queryable data store describing the user's runtime infrastructure — services, metric names, deployment topology, dependencies, log structure — built for AI-agent consumption and kept current on a recurring refresh cadence.
It sits alongside (but is architecturally distinct from) the agent context window: the context window is the ephemeral per-turn working set; the infrastructure memory is the persistent, retrievable substrate from which the agent draws relevant context on each question.
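The split between persistent memory and per-turn context window can be sketched in a few lines. This is an illustrative sketch, not Grafana's implementation: the `MemoryChunk` type, the cosine ranking, and the function names are all assumptions.

```python
# Hypothetical sketch: per-turn retrieval pulls from a persistent memory
# store into the ephemeral context window. All names are illustrative.
from dataclasses import dataclass

@dataclass
class MemoryChunk:
    service_group: str
    text: str
    embedding: list[float]  # precomputed at refresh time, not per turn

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def build_context_window(question_embedding: list[float],
                         store: list[MemoryChunk],
                         k: int = 3) -> list[str]:
    """Rank persistent memory chunks by similarity to the question and
    return the top-k texts to splice into this turn's context window."""
    ranked = sorted(store,
                    key=lambda c: cosine(question_embedding, c.embedding),
                    reverse=True)
    return [c.text for c in ranked[:k]]
```

The store outlives any conversation; only the top-k retrieved texts occupy context-window budget for a given turn.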
Why "infrastructure" specifically¶
The generalisation covers any domain-specific long-lived memory for agents (e.g. precomputed context files over code, conversation-thread summaries, customer-profile records). Infrastructure memory names the subset where the source of truth is live observability telemetry — metrics, logs, traces, K8s object state — and the captured knowledge is the shape of the running system, not the shape of code or user interactions.
Canonical Grafana framing (Source: sources/2026-05-01-grafana-how-grafana-assistant-learns-your-infrastructure-before-you-even-ask):
"A developer investigating an issue in their service can ask about upstream dependencies and get accurate answers, even if they've never looked at those systems before."
The target outcome is context-parity between experienced and new responders — the agent pre-loads what a senior SRE would already know about the upstream services, so a junior developer on their first incident can query that knowledge in natural language.
Structural properties¶
- Background-populated, not on-demand. The memory is built before any user asks. The first agent question doesn't pay the cost of discovering the infrastructure — that work was amortised across weekly refresh cycles.
- Typed schema, not freeform. Grafana's memory uses a fixed five-category schema per service group: identity, metrics, topology, dependencies, log structure. A typed schema is what makes the memory composable: the agent can reliably ask "what metrics exist for service X?" because every memory has the same Key-metrics field.
- Semantic-search-addressable. Retrieval maps a natural-language question → vector similarity over embedded memory chunks (concepts/semantic-search-over-agent-memory). The agent doesn't have to know the exact service name the user said; "checkout API" should resolve to the checkout-api service group's memory.
- Freshness bounded by refresh cadence. Weekly in the Grafana instance. Stale memory is worse than no memory (concepts/context-file-freshness): an agent confidently using an old metric name that was renamed last Tuesday is worse than admitting it doesn't know.
- ACL-aware at retrieval time. Memories inherit the access-control boundary of the data sources they were generated from (concepts/acl-propagated-agent-memory).
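The "typed schema, not freeform" property can be sketched as a fixed record type. The five category names follow the source; the field shapes and helper below are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class ServiceGroupMemory:
    # Five-category schema per service group (category names from the
    # Grafana framing; field shapes here are illustrative assumptions).
    identity: dict[str, str]        # e.g. name, owning team, environment
    key_metrics: list[str]          # metric names the group emits
    topology: dict[str, list[str]]  # e.g. cluster -> deployments
    dependencies: list[str]         # upstream/downstream service groups
    log_structure: dict[str, str]   # e.g. log field name -> type/format

def list_metrics(mem: ServiceGroupMemory) -> list[str]:
    """Because every memory shares the schema, 'what metrics exist for
    service X?' is a reliable field access, not a freeform-text parse."""
    return mem.key_metrics
```

This is what "composable" buys: the agent's tooling can address any service group's memory with the same accessors, regardless of which service it is.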
Relationship to other memory forms¶
| Memory form | Substrate | Example | Wiki page |
|---|---|---|---|
| Code / module memory | Source tree + config files | Meta per-module markdown | systems/meta-ai-precompute-engine |
| Conversation memory | Chat thread history | Cloudflare Agents-that-remember | (on companies/cloudflare) |
| Infrastructure memory | Live telemetry | Grafana Assistant | systems/grafana-assistant |
All three share the same structural pattern — a background extraction pipeline that precomputes structured context for AI agents — but differ in substrate and in the shape of the thing-to-remember. Infrastructure memory's distinctive property is that its source of truth (telemetry) is already continuously updated by the running system, so refresh is bounded by "when did the infra shape last change enough to matter?" rather than by "when did a human last update the docs?"
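The "refresh bounded by when the infra shape last changed" idea can be sketched as fingerprinting the telemetry *shape* and re-extracting only when it moves. This is a hypothetical mechanism, not one the Grafana post describes; all names are assumptions.

```python
import hashlib

def shape_fingerprint(metric_names: set[str],
                      dependency_edges: set[tuple[str, str]]) -> str:
    """Hash the shape of the infrastructure (metric names, dependency
    edges), not its values: values change constantly, shape rarely does."""
    payload = ("|".join(sorted(metric_names)) + "||" +
               "|".join(f"{a}->{b}" for a, b in sorted(dependency_edges)))
    return hashlib.sha256(payload.encode()).hexdigest()

def needs_refresh(stored_fingerprint: str,
                  live_metrics: set[str],
                  live_edges: set[tuple[str, str]]) -> bool:
    # Re-run the (expensive) LLM extraction only when the cheap
    # fingerprint of the live telemetry shape no longer matches.
    return shape_fingerprint(live_metrics, live_edges) != stored_fingerprint
```

The contrast with code or conversation memory is exactly this: telemetry keeps the fingerprint inputs current for free, whereas docs-based memory depends on a human remembering to update the source.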
Failure modes¶
- Telemetry-gap blind spots. If a service emits no metrics, it does not appear in the memory. The zero-config commitment ("If you have metrics, you get this infrastructure memory capability") is also a gating condition — un-instrumented services are invisible.
- Stale after unplanned change. If a service is redeployed with a renamed metric between refresh cycles, the memory is temporarily wrong. The manual-trigger escape hatch mitigates this, at the cost of requiring the user to know a refresh is needed.
- Swarm-inference cost scales with stack size. Weekly re-extraction of every service group on a large stack is non-trivial LLM compute (not disclosed in the Grafana post).
- Dependency graph is only as good as the trace coverage. Dependencies inferred from traces are limited to paths the traces actually cover; background sync jobs, periodic batch dependencies, or sampled-out hot paths may be missing.
- Cross-stack / cross-account memory scoping is undisclosed in the Grafana instance — whether the memory crosses stacks, accounts, or K8s clusters is a gap.
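The stale-after-unplanned-change failure mode suggests a cheap guardrail: validate remembered metric names against the live metric list before the agent asserts them. A minimal sketch, with all names assumed:

```python
def validate_memory_metrics(remembered: list[str],
                            live: set[str]) -> tuple[list[str], list[str]]:
    """Split remembered metric names into still-live vs stale, so the
    agent can flag a metric renamed since the last refresh instead of
    confidently querying a name that no longer exists."""
    ok = [m for m in remembered if m in live]
    stale = [m for m in remembered if m not in live]
    return ok, stale
```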
Seen in¶
- sources/2026-05-01-grafana-how-grafana-assistant-learns-your-infrastructure-before-you-even-ask — canonical first wiki instance of agent infrastructure memory, at the Grafana Cloud observability-stack altitude. Names the four-stage swarm-of-agents discovery pipeline, the five-category service-group knowledge schema, the vector-DB-backed memory store, semantic-search retrieval in milliseconds, weekly automatic refresh cadence with manual trigger, and data-source-inherited ACL propagation as the load-bearing architectural properties.
Related¶
- concepts/telemetry-as-discovery-substrate
- concepts/semantic-search-over-agent-memory
- concepts/acl-propagated-agent-memory
- concepts/service-group
- concepts/context-engineering
- concepts/context-file-freshness
- concepts/tribal-knowledge
- concepts/agent-context-window
- patterns/precomputed-agent-context-files
- patterns/swarm-of-discovery-agents-for-context-prebuild
- patterns/five-category-service-knowledge-schema
- patterns/weekly-refresh-cadence-for-agent-context
- systems/grafana-assistant
- systems/meta-ai-precompute-engine