CONCEPT Cited by 2 sources

Memory compaction

Definition

Memory compaction (also called context compaction) is the moment in an agent loop's lifecycle at which the context window is shortened — either because the conversation is about to exceed the model's limit, or because the agent is showing context rot as accumulated context grows.

Every long-running agent harness has a compaction strategy, implicit or explicit. The interesting design question is what happens to the discarded material.

Two strategies

Strategy — What happens at compaction

  • Discard (status quo) — the harness truncates the conversation and permanently loses everything pruned: tool outputs, side-channel facts, user preferences stated earlier.
  • Preserve-to-memory — the harness ships the about-to-be-pruned conversation to a memory service, which extracts, classifies, and stores facts / events / instructions / tasks for later retrieval.
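As a minimal sketch, the two strategies differ only in what happens to the pruned slice. Everything here is a toy: the `Message` type, the `memoryStore` array standing in for a durable memory service, and both function names are illustrative, not the Cloudflare API.

```typescript
// Toy message shape; a real harness carries richer metadata.
type Message = { role: "user" | "assistant" | "tool"; content: string };

// Stand-in for a durable memory service (illustrative, not a real API).
const memoryStore: string[] = [];

// Strategy 1: discard. Pruned messages are simply gone.
function compactDiscard(history: Message[], keep: number): Message[] {
  return history.slice(-keep);
}

// Strategy 2: preserve-to-memory. Ship pruned messages to memory first,
// then truncate. A real service would extract + classify, not store verbatim.
function compactPreserve(history: Message[], keep: number): Message[] {
  const pruned = history.slice(0, -keep);
  for (const m of pruned) memoryStore.push(m.content);
  return history.slice(-keep);
}

const history: Message[] = [
  { role: "user", content: "deploy target is staging-eu" },
  { role: "tool", content: "400 KB of log output" },
  { role: "user", content: "now run the migration" },
];

// Keep only the last message in the window; preserve the rest.
const kept = compactPreserve(history, 1);
```

Both functions return the same shortened window; the only difference is whether the pruned prefix is recoverable afterwards.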

Cloudflare Agent Memory is the canonical wiki instance of the second strategy:

"The critical moment in an agent's context lifecycle is compaction, when the harness decides to shorten context to stay within a model's limits or to avoid context rot. Today, most agents discard information permanently. Agent Memory preserves knowledge on compaction instead of losing it."

— (Cloudflare, 2026-04-17)

Why compaction is unavoidable

  • Hard model limits — context windows past 1M tokens exist, but are not free: every embedded-in-window token costs inference time + per-token money + attention share.
  • Soft accuracy limits — context rot means accuracy degrades well short of the token ceiling, so compaction pays off before the hard limit is reached.
  • Runaway tool-output growth — a single log-fetch or SQL query can materialise hundreds of KB of noise that dwarfs the original user intent.

Any agent running long enough will hit one of the three.
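The trigger logic behind these three pressures can be sketched as a simple budget check. The 4-characters-per-token estimate and both thresholds are illustrative assumptions, not real model limits:

```typescript
// Illustrative limits: a hard ceiling and a softer trigger to dodge context rot.
const HARD_LIMIT = 200_000;          // tokens (assumed, not a real model's limit)
const SOFT_LIMIT = HARD_LIMIT * 0.6; // compact well before the ceiling

// Crude heuristic: ~4 characters per token. Good enough for a trigger.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function shouldCompact(messages: string[]): boolean {
  const total = messages.reduce((n, m) => n + estimateTokens(m), 0);
  return total > SOFT_LIMIT;
}

// A single runaway tool output can trip the threshold on its own:
const bigToolOutput = "x".repeat(600_000); // ~150k tokens of log noise
const needsCompaction = shouldCompact(["summarise the logs", bigToolOutput]);
```

Note that the soft limit, not the hard one, fires first: that is the point of the second bullet above.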

Bulk-ingest hook

The Cloudflare API shape exposes compaction as an explicit handoff point:

// harness at compaction time
await profile.ingest(
  messagesAboutToBePruned,
  { sessionId }
);

ingest is the harness-invoked path for bulk handoff at compaction time; tool-calls (remember / recall / forget) are the direct-model path for moment-to-moment decisions.
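A sketch of how the two paths might sit behind one interface, with an in-memory stand-in. The `ingest` signature is modelled on the snippet above; the interface, class, and the verbatim-store behaviour are assumptions, not the real Cloudflare service (which extracts and classifies, and whose calls are awaited):

```typescript
// Hypothetical interface: method shapes are assumptions, not the real API.
interface MemoryService {
  remember(fact: string): void;   // direct-model path, called as a tool
  recall(query: string): string[]; // direct-model path, called as a tool
  ingest(messages: string[], opts: { sessionId: string }): void; // harness bulk path
}

class InMemoryService implements MemoryService {
  private facts: string[] = [];
  remember(fact: string) {
    this.facts.push(fact);
  }
  recall(query: string) {
    return this.facts.filter((f) => f.includes(query));
  }
  ingest(messages: string[], _opts: { sessionId: string }) {
    // A real service would extract + classify; this toy stores verbatim.
    for (const m of messages) this.facts.push(m);
  }
}

const memory = new InMemoryService();
// Model decides mid-turn that a fact is worth keeping:
memory.remember("user prefers tabs over spaces");
// Harness hands off in bulk at compaction time:
memory.ingest(["deploy target is staging-eu"], { sessionId: "s1" });
// A future turn retrieves via explicit recall:
const hits = memory.recall("staging");
```

The design point is that both paths land in the same store, so facts preserved at compaction are retrievable by the same recall the model already uses.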

Sibling patterns for in-window compression

Compaction-to-memory is distinct from (and complementary to) in-window compression strategies:

  • Tree-structured conversation memory (Project Think Persistent Sessions) — non-destructive compaction: older branches stay in SQLite + FTS, compacted summary in window, agent can search_context to retrieve specifics on demand.
  • Summarisation-in-place — naive: replace N old messages with a summary in window. Lossy, permanent.
  • Write tool results to disk (Dropbox / Cursor / Claude Code shift; see sources/2026-01-28-dropbox-knowledge-graphs-mcp-dspy-dash) — reference by handle from window, fetch-on-demand if the model needs the specifics.
  • Memory compaction — extract + classify + durably store in a retrieval substrate separate from the conversation log; retrieved on future turns via explicit recall.
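The third sibling pattern, write tool results to disk and reference by handle, can be sketched as follows. The `Map` standing in for the filesystem, the handle scheme, and both function names are illustrative assumptions:

```typescript
// Stand-in for on-disk storage of bulky tool outputs.
const blobStore = new Map<string, string>();

// Stash the full payload; only a short handle + preview enters the window.
function stashToolResult(output: string): string {
  const handle = `result-${blobStore.size + 1}`; // illustrative handle scheme
  blobStore.set(handle, output);
  return `[${handle}: ${output.length} chars, fetch on demand]`;
}

// Fetch-on-demand: the model asks for the payload only if it needs specifics.
function fetchToolResult(handle: string): string | undefined {
  return blobStore.get(handle);
}

// A 200 KB query result shrinks to a one-line reference in the window:
const inWindow = stashToolResult("SELECT * FROM logs\n" + "row\n".repeat(50_000));
const full = fetchToolResult("result-1");
```

This directly counters the runaway tool-output growth described earlier: the window holds the handle, not the hundreds of KB of noise.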
