Skip to content

CONCEPT Cited by 1 source

Context exhaustion

The failure mode where an LLM agent's context window fills up and the model begins cannibalising its own memory — forgetting bugs it spent hours tracking down, losing hypotheses mid-investigation, or producing degraded output quality.

Definition

Context exhaustion occurs when: 1. The volume of code, findings, or reasoning accumulated within a single agent session exceeds the effective window. 2. The model begins dropping earlier context to make room for new tokens. 3. Quality degrades non-linearly — often without the model signalling the loss.

This is distinct from concepts/context-rot (quality degradation over time within a long context) — context exhaustion is the hard-ceiling failure where information is structurally lost.

Cloudflare's experience: "An hour in, the context window fills up and the model will cannibalize its own memory, instantly forgetting the bugs it spent all morning tracking down" (Source: sources/2026-06-18-cloudflare-build-your-own-vulnerability-harness).

Mitigations

  • Externalise stateconcepts/stateless-agent-compute: all findings and intermediate state in external storage
  • Hyper-focused agents — keep each agent's job narrow enough that context usage stays below 25% of the total window
  • Stage pipeline — split work into independent stages with clear output contracts

Seen in

Last updated · 542 distilled / 1,571 read