Online context summarisation¶
Definition¶
Online context summarisation is the pattern of compacting state round-by-round as an investigation progresses — as opposed to offline summarisation (done retrospectively after the full transcript exists) or no summarisation (raw history carried forward). The summarisation happens incrementally on each round, producing curated artifacts that replace raw message history.
Named by Slack's Security Engineering team in the Spear architecture (Source: sources/2026-04-13-slack-managing-context-in-long-run-agentic-applications):
"Collectively, these channels provide a means of online context summarisation, negating the need for extensive message histories."
The "online" qualifier is load-bearing: summarisation happens concurrently with the investigation, not as a post-hoc pass.
Why online, not offline¶
Three structural reasons:
1. Agent invocations need summaries now¶
The Director needs to read the current timeline to decide its next question. The Expert needs to read the current Journal to understand prior context. These consumers can't wait for an offline summariser to run after the investigation concludes — they need summarised state as input to the next round.
2. State grows unboundedly; carry-forward scales linearly¶
An offline pass over the full transcript assumes the full transcript fits somewhere. For Slack's "hundreds of inference requests and megabytes of output" scenarios, even offline summarisation of the full history is expensive. Online summarisation keeps each round's summary small (only the new findings since last round) and compounds into a running summary over the full investigation.
3. Quality of the summary improves if done incrementally¶
Summarising 100 rounds' worth of history in one offline pass is a much harder task than summarising 1 round's findings incrementally into an existing running summary. Online summarisation splits the work into small, tractable chunks.
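The incremental split in reason 3 can be sketched in a few lines of Python. This is a hypothetical illustration, not Slack's implementation: `summarise` stands in for a model call (simple truncation), so each step's input stays small regardless of how many rounds have run.

```python
# Hypothetical sketch of reason 3: each online step compresses only the new
# round's findings into a small running summary, keeping every summarisation
# call's input bounded.

def summarise(text: str, budget: int = 200) -> str:
    """Stand-in for an LLM summarisation call: cap `text` at `budget` chars."""
    return text if len(text) <= budget else text[:budget] + "…"

def online_update(running_summary: str, round_findings: str) -> str:
    # Merge only the NEW findings into the existing running summary.
    return summarise(running_summary + "\n" + round_findings)

running = ""
for i in range(1, 101):  # 100 investigation rounds
    running = online_update(running, f"round {i}: finding about host-{i}")

# The running summary stays bounded no matter how many rounds run.
assert len(running) <= 201
```

An offline pass would instead have to summarise all 100 rounds' raw text in one call; the online version never hands the model more than one round plus the bounded running summary.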
Slack's three-artifact online summary¶
In Spear, the "online summarisation" is distributed across three artifacts, each produced by a different agent on each round (Source: sources/2026-04-13-slack-managing-context-in-long-run-agentic-applications):
1. Director's Journal (grows monotonically)¶
The Director appends new entries (decisions, observations, findings, questions, actions, hypotheses) each round. The Journal grows, but each entry is compact and typed — unlike a raw message log, the Journal doesn't include full tool arguments, tool results, or LLM reasoning traces.
2. Critic's Review (replaced each round)¶
The Critic's Review is produced fresh each round from the newest Expert findings. The previous round's Review is not carried forward — it has already been consumed (by the Director for its decision) and merged into the running Timeline. Each Review is a point-in-time summary of the most recent findings, not accumulated findings.
3. Critic's Timeline (rewritten each round)¶
The Timeline is the running chronological narrative. On each round, the Critic rebuilds it from "The most recent Review, the previous Critic's Timeline, the Director's Journal." The Timeline is the canonical online summary: the previous Timeline becomes input for the next Timeline, and the new Review + Journal are merged in.
This is a fold in the functional-programming sense: `new_timeline = merge(prev_timeline, new_review, journal)`.
The fold pattern¶
The Timeline update step is architecturally interesting because it's a fold over investigation rounds:
timeline_0 = empty
timeline_1 = merge(timeline_0, review_1, journal_1)
timeline_2 = merge(timeline_1, review_2, journal_2)
...
timeline_N = merge(timeline_{N-1}, review_N, journal_N)
Each merge is a bounded-work operation (the Timeline is capped at a reasonable length: events + top-3 gaps + score), so the fold's total work is O(N) over the investigation's N rounds, whereas re-summarising the full raw transcript on every round would cost O(N²).
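A minimal Python sketch of the fold. Slack does not publish the Timeline schema, so the field names, `MAX_EVENTS` cap, dict shapes, and merge logic here are all assumptions chosen to make the bounded-work property concrete:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Timeline:
    events: tuple   # chronological event descriptions, capped
    gaps: tuple     # top-3 open gaps / questions
    score: float    # credibility score

MAX_EVENTS = 50  # illustrative cap, not Slack's actual limit

def merge(prev: Timeline, review: dict, journal: list) -> Timeline:
    """One fold step: fold the newest Review and the Journal into the Timeline."""
    events = (prev.events + tuple(review["events"]) + tuple(journal))[-MAX_EVENTS:]
    gaps = tuple(review["gaps"][:3])   # keep only the top-3 gaps
    return Timeline(events=events, gaps=gaps, score=review["score"])

timeline = Timeline(events=(), gaps=(), score=0.0)   # timeline_0 = empty
for n in range(1, 6):                                # five rounds
    review_n = {"events": [f"round-{n} finding"], "gaps": [f"gap-{n}"], "score": 0.1 * n}
    journal_n = [f"decision-{n}: follow up on round-{n} finding"]
    timeline = merge(timeline, review_n, journal_n)  # timeline_n = merge(...)

assert len(timeline.events) == 10    # bounded: never exceeds MAX_EVENTS
assert timeline.gaps == ("gap-5",)   # only the latest Review's gaps survive
```

Because each step touches only the capped previous Timeline plus one round's Review and Journal, total work over N rounds is linear in N.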
The summarisation quality-vs-retention trade-off¶
Online summarisation is lossy — the moment round 3's findings are merged into the Timeline, the raw round-3 reasoning is gone from the agent's working state. Two mitigations co-occur:
- Event stream persistence. Slack's Hub/Worker/Dashboard architecture persists all tool calls, model invocations, and system events (see the 2025-12-01 first Spear post for details). Raw state is not lost from the system, just not carried forward into agent prompts. A human supervisor can replay the event stream.
- Typed entries in the Journal. The Journal's typed entries survive in-prompt, so the structural reasoning (decisions, hypotheses) is preserved across rounds even as raw tool-call traces are dropped.
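The two mitigations can be sketched together. The names `event_log`, `record_event`, and `journal_entry` are illustrative stand-ins, not Slack's actual APIs:

```python
import json

# Sketch of the two mitigations: raw events go to an append-only persisted
# log (replayable by a human supervisor), while only compact typed Journal
# entries are carried forward in agent prompts.

event_log = []   # stands in for the persisted Hub/Worker event stream

def record_event(kind: str, payload: dict) -> None:
    event_log.append({"kind": kind, "payload": payload})   # lossless, replayable

journal = []     # typed entries that DO survive in-prompt across rounds

def journal_entry(entry_type: str, text: str) -> None:
    journal.append({"type": entry_type, "text": text})     # compact, typed

# Round 3: the raw tool output is logged, but only the typed finding
# enters the next round's prompt.
record_event("tool_call", {"tool": "dns_lookup", "raw_output": "x" * 10_000})
journal_entry("finding", "host resolves to a known-bad address")
journal_entry("hypothesis", "compromised credential reuse")

prompt_state = json.dumps(journal)        # what the next round's agent sees
assert "x" * 10_000 not in prompt_state   # raw trace dropped from the prompt
assert any(e["kind"] == "tool_call" for e in event_log)   # but still replayable
```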
Contrasts¶
- vs. offline summarisation — offline runs over the full transcript after the task completes. Used for post-investigation reports. Online is used for in-loop continuity.
- vs. no summarisation — carry raw history forward. Works for short tasks, scales poorly (see concepts/no-message-history-carry-forward).
- vs. sliding-window compaction — truncate oldest N messages when the window fills. Lossy in a different way: loses the early state rather than compacting it.
- vs. summarisation middleware — some frameworks trigger auto-summarisation when the context window approaches full. This is online summarisation triggered by token pressure; Slack's pattern is always-on online summarisation regardless of token pressure.
- vs. memory systems (e.g. MemGPT) — memory systems externalise summaries to a retrieval layer. Slack's pattern carries the online summary in-prompt as structured state, not in an external retrieval layer.
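The middleware contrast can be sketched as two update rules. The threshold, the token counting, and both `*_step` functions are assumptions for illustration, not any framework's real API:

```python
# Summarisation middleware compacts only under token pressure; Slack-style
# online summarisation compacts on every round.

TOKEN_LIMIT = 8_000

def tokens(messages: list) -> int:
    return sum(len(m.split()) for m in messages)   # crude stand-in token count

def middleware_step(history: list, new_msg: str) -> list:
    history = history + [new_msg]
    if tokens(history) > int(0.8 * TOKEN_LIMIT):   # trigger: token pressure
        history = ["[summary of earlier turns]"] + history[-2:]
    return history

def online_step(summary: str, new_msg: str) -> str:
    return (summary + " | " + new_msg)[-500:]      # compact on EVERY round

history, summary = [], ""
for i in range(1, 401):
    msg = f"turn {i}: " + "word " * 30
    history = middleware_step(history, msg)
    summary = online_step(summary, msg)

assert len(history) < 400     # middleware compacted, but only when pressured
assert len(summary) <= 500    # online summary is bounded after every round
```

The observable difference: the middleware's history repeatedly grows toward the limit and then collapses, while the always-on summary never exceeds its bound at any point in the loop.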
When online summarisation is the right pattern¶
- Long-running loops with natural summarisation artifacts. Timelines, journals, scored findings — any state that can be updated incrementally.
- Multi-agent systems where different agents need different summary views. Online summarisation naturally separates into per-consumer artifacts.
- Supervisory / human-in-the-loop contexts. Online summaries are readable by humans at any point; offline summaries only exist post-task.
When it's the wrong pattern¶
- Tasks where the full transcript is the output. Transcription, translation, code-review comment threads: the raw stream is the deliverable.
- Tasks with short, predictable context. Single-turn chatbots don't benefit from the overhead of summary artifact maintenance.
- Tasks requiring lossless replay. If every intermediate reasoning step is needed for audit (legal, medical, regulated contexts), keep the event stream as the primary artifact and don't rely on the compacted summary for later reconstruction.
Seen in¶
- systems/slack-spear — canonical first wiki instance. Three artifacts (Journal + Review + Timeline) updated each round; Timeline is a fold over (prev_timeline, review, journal). Slack's verbatim claim: "Collectively, these channels provide a means of online context summarisation, negating the need for extensive message histories." (Source: sources/2026-04-13-slack-managing-context-in-long-run-agentic-applications)
Related¶
- concepts/context-engineering
- concepts/no-message-history-carry-forward
- concepts/structured-journaling-tool
- concepts/credibility-scoring-rubric
- concepts/narrative-coherence-as-hallucination-filter
- patterns/three-channel-context-architecture
- patterns/timeline-assembly-from-scored-findings
- patterns/director-expert-critic-investigation-loop
- patterns/one-model-invocation-per-task