PATTERN Cited by 1 source

Token-budget pagination

Token-budget pagination cuts a tool's response after a fixed token count (not a record count) and returns a cursor for continuation. It is the record-size-variance-robust variant of classic API pagination, applied to tools whose consumer is an LLM agent with a hard token budget (Source: sources/2026-03-04-datadog-mcp-server-agent-tools).

Motivation

Traditional APIs paginate by record count (?limit=N), which is bounded in records but unbounded in bytes/tokens when records vary in size. Datadog's concrete forcing case: log messages range from ~100 bytes to ~1 MB. An agent requesting "a reasonable number of logs" that happens to land a handful of 1 MB records has blown its context window before it can reason over the rest of the page.

Shape

  • The server counts tokens while serializing the response, using the same tokenizer it expects the consuming LLM to use (the OpenAI tokenizer is the commonly cited reference).
  • It stops serializing once token_budget is reached and flushes what it has.
  • Returns a continuation cursor in the response so the agent (or its harness) can request the next page.
  • The page is variable in record count; this is the whole point.
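
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Datadog's implementation: `paginate_by_tokens`, `approx_tokens`, the response field names, and the offset-as-string cursor are all hypothetical, and the 4-characters-per-token estimate stands in for a real tokenizer.

```python
import json
from typing import Callable, Optional, Sequence


def approx_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer (assumption: ~4 characters per token)."""
    return max(1, (len(text) + 3) // 4)


def paginate_by_tokens(
    records: Sequence[dict],
    token_budget: int,
    cursor: Optional[str] = None,
    count_tokens: Callable[[str], int] = approx_tokens,
) -> dict:
    """Return as many records as fit in token_budget, plus a continuation cursor.

    The cursor here is just the next record offset encoded as a string
    (an illustrative choice; a real server would likely use an opaque token).
    """
    start = int(cursor) if cursor else 0
    page, used = [], 0
    i = start
    while i < len(records):
        cost = count_tokens(json.dumps(records[i]))
        # Stop before overflowing the budget, but always emit at least one
        # record so a single oversized record cannot stall pagination.
        if page and used + cost > token_budget:
            break
        page.append(records[i])
        used += cost
        i += 1
    next_cursor = str(i) if i < len(records) else None
    return {"records": page, "tokens_used": used, "next_cursor": next_cursor}


# Example: small records surrounding one very large record.
logs = [{"msg": "x" * 40}] * 5 + [{"msg": "y" * 4000}] + [{"msg": "z" * 40}] * 3

page_sizes, cursor = [], None
while True:
    page = paginate_by_tokens(logs, token_budget=100, cursor=cursor)
    page_sizes.append(len(page["records"]))
    cursor = page["next_cursor"]
    if cursor is None:
        break
# page_sizes is [5, 1, 3]: the record count varies, the token cost stays bounded
```

One design choice in this sketch is worth flagging: a record that is larger than the whole budget is still returned alone on its own page, which exceeds the budget for that call. A server could instead truncate such a record; the source does not say which approach is used.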

Why this is not just "paginate smaller"

A fixed tiny page size (e.g. 10 records) also bounds the worst case, but it wastes budget in the common case (small records) and forces the agent to make many more tool calls, inflating latency and tool-call-accuracy risk, the other scarce resource (discussed in patterns/tool-surface-minimization). Token-budget pagination adapts page size to the record-size distribution at the server, which is the party that already has the serialized bytes in hand.
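
To make the tool-call trade-off concrete, the following sketch counts how many calls each strategy needs to drain the same result set. The workload (1,000 records of 25 tokens each), the 5,000-token budget, and both function names are illustrative assumptions, not figures from the source.

```python
import math
from typing import Sequence


def calls_with_fixed_page(token_costs: Sequence[int], page_size: int) -> int:
    """Number of tool calls needed when every page holds page_size records."""
    return math.ceil(len(token_costs) / page_size)


def calls_with_token_budget(token_costs: Sequence[int], token_budget: int) -> int:
    """Number of tool calls needed when each page is cut at token_budget."""
    calls, used = 0, 0
    for cost in token_costs:
        # Open a new page for the first record, or when this record
        # would push the current page past the budget.
        if calls == 0 or used + cost > token_budget:
            calls += 1
            used = 0
        used += cost
    return calls


# Hypothetical common case: 1,000 small records at 25 tokens each.
small_records = [25] * 1000

fixed_calls = calls_with_fixed_page(small_records, page_size=10)
budget_calls = calls_with_token_budget(small_records, token_budget=5000)
# fixed_calls is 100, budget_calls is 5
```

Under these assumptions the fixed 10-record page uses only 250 tokens per call and needs 20 times as many calls to return the same data.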

Decaying constraint?

If client-side "write long tool results to disk" becomes an MCP spec feature (Cursor dynamic context discovery; Claude Code), the per-call token budget softens: the agent can reason over a file handle instead of inlined content. Until that ships broadly, token-budget pagination is the default shape for any MCP tool whose rows are size-variable.

Seen in

Last updated · 200 distilled / 1,178 read