CONCEPT Cited by 1 source

Grep loop

Definition

The grep loop is an agent failure mode where a documentation set that exceeds the agent's context window forces the agent into iterative grep-style keyword search across the corpus rather than reading the relevant section directly.

Named after Unix grep because the agent's behaviour looks identical to a human impatiently grep-ing through unfamiliar source: guess a keyword, see the hits, refine the keyword, try again, eventually find the answer — or give up and answer imprecisely.

Cloudflare's framing (2026-04-17)

Three failure-mode properties, each compounding:

  1. Cannot read the whole file. The llms.txt is too large for the agent's context window, so the agent falls back to grep-style keyword searches.
  2. Narrowed context → lower accuracy. "When an agent relies on iterative searching rather than reading the full file, it loses the broader context of the documentation at hand. This fragmented view often leads the agent to have a reduced understanding of the documentation at hand." — a missed concept in an adjacent section never surfaces.
  3. Latency and token bloat. Each iteration burns "new thinking tokens" plus another search round-trip; total user-visible latency adds up; the total cost exceeds what a single doc-read would have cost.
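
The three properties above can be made concrete with a toy simulation. This is a minimal sketch, not a model of any real agent: the corpus, the keyword sequence, the whitespace-based `estimate_tokens`, and the fixed `thinking_overhead` per iteration are all assumptions chosen for illustration.

```python
import re


def estimate_tokens(text):
    """Crude token estimate: count whitespace-separated words (assumption)."""
    return len(text.split())


def grep(lines, keyword):
    """Return the lines containing keyword, case-insensitively (like `grep -i`)."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return [line for line in lines if pattern.search(line)]


def grep_loop(lines, keywords, is_answer, thinking_overhead=50):
    """Try keywords in order until some hit satisfies is_answer.

    Returns (answer_or_None, iterations_used, tokens_spent). Each iteration
    pays a fixed, hypothetical thinking overhead plus the tokens of every hit.
    """
    tokens = 0
    for iteration, keyword in enumerate(keywords, start=1):
        hits = grep(lines, keyword)
        tokens += thinking_overhead + sum(estimate_tokens(h) for h in hits)
        for hit in hits:
            if is_answer(hit):
                return hit, iteration, tokens
    return None, len(keywords), tokens


# Toy corpus; product names and facts are invented for the example.
corpus = [
    "Alpha quickstart: install the CLI and deploy a project.",
    "Alpha limits: each project may create up to ten widgets.",
    "Beta overview: Beta stores files for Alpha projects.",
    "Request caps: Beta rejects uploads above the plan ceiling.",
]

# The agent guesses keywords for "what is the upload limit for Beta?".
answer, iterations, tokens = grep_loop(
    corpus,
    keywords=["quota", "limit", "caps"],
    is_answer=lambda line: "uploads" in line,
)
whole_read_tokens = sum(estimate_tokens(line) for line in corpus)
```

In this toy run the agent needs three guesses, pays the overhead three times, and never sees the "Beta overview" line that explains what Beta is. The absolute numbers mean nothing outside the sketch; only the shape of the cost (per-iteration overhead plus fragmentary hits) mirrors the description above.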

Why it matters specifically for agents

For a human developer, grep-ing through a large codebase is fine — a human knows the surrounding context from prior experience. For an agent with a fresh context window and no out-of-session memory, grep-ing is a worst-case form of reading: only snippets that match the query enter context; the surrounding mental model doesn't.

Canonical structural fix

Split the corpus into context-window-sized chunks. Cloudflare's 2026-04-17 canonical answer: one llms.txt per top-level directory, root file points to each — captured as patterns/split-llms-txt-per-subdirectory. Each per-directory llms.txt fits in a single context window; the agent reads the index once, identifies the exact product doc it needs, and fetches it via markdown content negotiation in a single, linear path — no grep loop.
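
As a rough illustration of the split, the sketch below groups the entries of a flat llms.txt by the first path segment of each URL and emits a root index that points to one llms.txt per top-level directory. It assumes entries follow the common `- [Title](url): description` link-list form and that every URL sits under a top-level directory; `split_index`, `markdown_request`, the `docs.example.com` URLs, and the placeholder root descriptions are hypothetical. The request helper assumes the server honors an `Accept: text/markdown` header for content negotiation.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse
from urllib.request import Request

# One index entry: "- [Title](url)" with an optional ": description" suffix.
ENTRY = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?$"
)


def split_index(llms_txt, base_url):
    """Return {relative_path: contents} for a root llms.txt plus one per directory."""
    groups = defaultdict(list)
    for raw in llms_txt.splitlines():
        line = raw.strip()
        match = ENTRY.match(line)
        if match is None:
            continue  # skip headings, blank lines, and free prose
        path = urlparse(match["url"]).path.strip("/")
        top = path.split("/", 1)[0]
        if not top:
            continue  # entry points at the site root; out of scope for this sketch
        groups[top].append(line)

    files = {}
    root = ["# Documentation index", ""]
    base = base_url.rstrip("/")
    for top in sorted(groups):
        entries = groups[top]
        files[f"{top}/llms.txt"] = "\n".join([f"# {top}", "", *entries]) + "\n"
        # Placeholder description; a real index would carry a written summary.
        root.append(f"- [{top}]({base}/{top}/llms.txt): {len(entries)} pages")
    files["llms.txt"] = "\n".join(root) + "\n"
    return files


def markdown_request(url):
    """Build (but do not send) a request that asks for the markdown rendition."""
    return Request(url, headers={"Accept": "text/markdown"})


flat = """# Example Docs

- [Alpha quickstart](https://docs.example.com/alpha/quickstart/): Install and run Alpha.
- [Alpha limits](https://docs.example.com/alpha/limits/): Quotas for Alpha.
- [Beta overview](https://docs.example.com/beta/overview/): What Beta does.
"""
files = split_index(flat, "https://docs.example.com")
request = markdown_request("https://docs.example.com/alpha/limits/")
```

With this layout an agent would read the small root file, pick `alpha/llms.txt`, and then request the one page it needs, which is the linear path described above.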

Complementary practices:

  • Remove directory-listing pages (token cost, no semantic content).
  • Ensure every index entry has rich titles + descriptions — the agent's steering wheel.
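
Both practices lend themselves to a simple lint pass over the index and the pages it links to. The sketch below is one possible heuristic, not a method from the source: `weak_entries` flags index entries whose description is missing or shorter than a hypothetical `min_desc_words`, and `looks_like_listing` guesses that a page is a directory listing when most of its non-heading lines are bare links. Both thresholds are assumptions.

```python
import re

# One index entry: "- [Title](url)" with an optional ": description" suffix.
ENTRY = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?$"
)
# A list item that is nothing but a link.
LINK_ONLY = re.compile(r"^[-*] \[[^\]]+\]\([^)]+\)$")


def weak_entries(index_text, min_desc_words=4):
    """Return titles of entries whose description is missing or very short."""
    weak = []
    for raw in index_text.splitlines():
        match = ENTRY.match(raw.strip())
        if match is None:
            continue  # not an index entry
        description = (match["desc"] or "").strip()
        if len(description.split()) < min_desc_words:
            weak.append(match["title"])
    return weak


def looks_like_listing(page_markdown, threshold=0.8):
    """Heuristic: True when most non-heading lines are bare links."""
    lines = [line.strip() for line in page_markdown.splitlines() if line.strip()]
    body = [line for line in lines if not line.startswith("#")]
    if not body:
        return False  # nothing to judge
    link_only = sum(1 for line in body if LINK_ONLY.match(line))
    return link_only / len(body) >= threshold


index = """- [Alpha quickstart](https://docs.example.com/alpha/quickstart/): Install Alpha and deploy a first project.
- [Alpha limits](https://docs.example.com/alpha/limits/)
- [Beta](https://docs.example.com/beta/): Overview.
"""
listing_page = "# Alpha\n\n- [Quickstart](/alpha/quickstart/)\n- [Limits](/alpha/limits/)\n"
content_page = "# Alpha limits\n\nEach project may create up to ten widgets.\n- [Quickstart](/alpha/quickstart/)\n"

flagged = weak_entries(index)
```

Pages for which `looks_like_listing` returns true are candidates for removal from the index, and titles returned by `weak_entries` are candidates for a fuller description; either result would still need human review.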

Benchmark evidence

Cloudflare's Kimi-k2.5/OpenCode benchmark compared its own llms.txt against those of other large technical documentation sites and reported:

  • 31% fewer tokens.
  • 66% faster to a correct answer.

Both framed as the result of avoiding the grep loop.

Related concepts

  • Context-window exhaustion proper — when the total in-context content exceeds the window's capacity. The grep loop is the behaviour when exhaustion forces truncation/paging.
  • Context engineering — the discipline of choosing what enters the context window; the grep loop is a symptom of poor context engineering on the documentation-author side.

