Memory compaction¶
Definition¶
Memory compaction (also called context compaction) is the lifecycle moment in an agent loop at which the context window is shortened — either because the conversation is about to exceed the model's hard limit, or because accumulated context is causing context rot.
Every long-running agent harness has a compaction strategy, implicit or explicit. The interesting design question is what happens to the discarded material.
Two strategies¶
| Strategy | What happens at compaction |
|---|---|
| Discard (status quo) | Harness truncates the conversation and permanently loses everything pruned — tool outputs, side-channel facts, user preferences stated earlier. |
| Preserve-to-memory | Harness ships the about-to-be-pruned conversation to a memory service, which extracts + classifies + stores facts / events / instructions / tasks for later retrieval. |
Cloudflare Agent Memory is the canonical wiki instance of the second strategy:
"The critical moment in an agent's context lifecycle is compaction, when the harness decides to shorten context to stay within a model's limits or to avoid context rot. Today, most agents discard information permanently. Agent Memory preserves knowledge on compaction instead of losing it."
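The contrast between the two rows can be sketched as a compaction hook. This is a minimal sketch; the `MemoryService` interface, `ingest` method signature, and function names are illustrative assumptions, not Cloudflare's actual API.

```typescript
type Message = { role: "user" | "assistant" | "tool"; content: string };

// Hypothetical interface standing in for a memory service's bulk path.
interface MemoryService {
  ingest(messages: Message[]): Promise<void>;
}

// Discard (status quo): truncate and permanently lose the pruned prefix.
function compactDiscard(history: Message[], keep: number): Message[] {
  return history.slice(-keep);
}

// Preserve-to-memory: ship the about-to-be-pruned prefix to a memory
// service (which extracts, classifies, and stores it) before truncating.
async function compactPreserve(
  history: Message[],
  keep: number,
  memory: MemoryService
): Promise<Message[]> {
  const pruned = history.slice(0, Math.max(0, history.length - keep));
  if (pruned.length > 0) {
    await memory.ingest(pruned); // knowledge survives compaction
  }
  return history.slice(-keep);
}
```

Both strategies return the same shortened window; the difference is entirely in what happens to the pruned prefix on the way out.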
Why compaction is unavoidable¶
- Hard model limits — context windows past 1M tokens exist, but are not free: every embedded-in-window token costs inference time + per-token money + attention share.
- Soft accuracy limits — context rot means accuracy degrades well short of the token ceiling, so the effective budget is smaller than the hard limit.
- Runaway tool-output growth — a single log-fetch or SQL query can materialise hundreds of KB of noise that dwarfs the original user intent.
Any agent running long enough will hit one of the three.
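The soft-limit trigger above can be sketched as a token-budget check. The thresholds and the rough 4-characters-per-token estimate are illustrative assumptions; a real harness would use the model's own tokenizer and tuned limits.

```typescript
const HARD_LIMIT_TOKENS = 200_000; // model's context window (illustrative)
const SOFT_LIMIT_RATIO = 0.7;      // compact early to stay ahead of context rot

// Crude token estimate: ~4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Compact when the estimated total crosses the soft limit, well before
// the hard ceiling would force a truncation mid-task.
function shouldCompact(messages: string[]): boolean {
  const total = messages.reduce((sum, m) => sum + estimateTokens(m), 0);
  return total > HARD_LIMIT_TOKENS * SOFT_LIMIT_RATIO;
}
```

Note how a single runaway tool output (the third bullet) can flip this check on its own, which is why the harness, not the model, owns the trigger.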
Bulk-ingest hook¶
The Cloudflare API shape exposes compaction as an explicit handoff point:
`ingest` is the bulk path typically called when the harness compacts context. Tool-calls (`remember` / `recall` / `forget`) are the direct-model path for moment-to-moment decisions; `ingest` is the harness-invoked path for bulk handoff.
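The two paths can be sketched as follows. The route, payload fields, and helper names are hypothetical, assumed for illustration only; the documented Agent Memory API may differ.

```typescript
type Fact = { kind: "fact" | "event" | "instruction" | "task"; text: string };

// Direct-model path: the model emits a tool call for a single decision.
function rememberToolCall(fact: Fact) {
  return { tool: "remember", arguments: fact };
}

// Harness-invoked path: at compaction, the whole pruned transcript is
// handed off in bulk for extraction and classification server-side.
function buildIngestRequest(prunedTranscript: string[]) {
  return {
    method: "POST" as const,
    path: "/memory/ingest", // hypothetical route
    body: { messages: prunedTranscript },
  };
}
```

The asymmetry is the point: the model reasons about individual facts, while the harness moves transcripts wholesale at the compaction boundary.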
Sibling patterns for in-window compression¶
Compaction-to-memory is distinct from (and complementary to) in-window compression strategies:
- Tree-structured conversation memory (Project Think Persistent Sessions) — non-destructive compaction: older branches stay in SQLite + FTS, a compacted summary stays in the window, and the agent can call `search_context` to retrieve specifics on demand.
- Summarisation-in-place — naive: replace N old messages with an in-window summary. Lossy and permanent.
- Write tool results to disk (Dropbox / Cursor / Claude Code shift; see sources/2026-01-28-dropbox-knowledge-graphs-mcp-dspy-dash) — reference by handle from the window, fetch-on-demand if the model needs the specifics.
- Memory compaction — extract + classify + durably store in a retrieval substrate separate from the conversation log; retrieved on future turns via explicit `recall`.
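The write-to-disk sibling can be sketched as a handle store. The in-memory `Map` stands in for a real filesystem or blob store, and the handle format is an illustrative assumption.

```typescript
// Out-of-window store for bulky tool outputs (a real harness would
// use disk or object storage; a Map keeps the sketch self-contained).
const store = new Map<string, string>();

// Stash a large tool output; only the short handle enters the context window.
function stashToolOutput(output: string): string {
  const handle = `artifact:${store.size + 1}`;
  store.set(handle, output);
  return handle;
}

// Fetch-on-demand: the model asks for the full payload only when it
// actually needs the specifics.
function fetchOnDemand(handle: string): string | undefined {
  return store.get(handle);
}
```

This keeps the window holding a few bytes of reference per artifact rather than hundreds of KB of log noise, which is exactly the runaway-tool-output failure mode compaction otherwise has to clean up.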
Seen in¶
- sources/2026-04-17-cloudflare-agents-that-remember-introducing-agent-memory — explicit naming of compaction as the critical lifecycle moment + `ingest` as the bulk-hook API.
- sources/2026-04-15-cloudflare-project-think-building-the-next-generation-of-ai-agents — Persistent Sessions' non-destructive compaction as the in-window sibling design (tree + FTS, with a `search_context` tool for retrieval).
Related¶
- concepts/agent-memory — the broader concept; memory compaction is the lifecycle moment memory writes into.
- concepts/agent-context-window — the scarce resource compaction protects.
- concepts/context-rot — the forcing function that makes compaction happen before the hard limit.
- concepts/context-engineering — the discipline compaction is one tactic of.
- systems/cloudflare-agent-memory — canonical preserve-to-memory realisation.
- systems/project-think — canonical non-destructive in-window sibling.
- patterns/tree-structured-conversation-memory — the in-window non-destructive variant.