PATTERN Cited by 1 source
Gapfill requeue for coverage¶
Intent¶
Have agents self-report coverage gaps in their own work output, then automatically re-queue those gaps as new narrow tasks for the next pass. Counteracts the model's tendency to drift toward attack classes (or domains, or hypotheses) where it has already had success and away from those where it hasn't.
Canonical articulation¶
Cloudflare's Gapfill stage of the vulnerability discovery harness (Source: sources/2026-05-18-cloudflare-project-glasswing-what-mythos-showed-us):
"Hunters flag areas they touched but didn't cover thoroughly. Those areas get re-queued for another pass. Counteracts the model's tendency to drift toward attack classes it has already had success with."
The two-step shape¶
- Self-report. Hunters add a "covered shallowly: " annotation alongside their findings — explicitly declaring places they touched but didn't fully investigate.
- Re-queue. A Gapfill stage harvests those annotations and adds them as new tasks to the Hunt queue, possibly with adjusted scope hints (different attack class, deeper trust-boundary context, etc.) so the next agent doesn't repeat the shallow pass.
Why agent self-reporting works here¶
The hunter agent often knows it didn't cover an area thoroughly — "there's a use of this function I didn't investigate; I'm focused on command injection but a buffer overflow check would also apply". The information exists inside the agent's reasoning trace; the pattern is just asking the agent to surface it as a structured output.
This is a workable form of calibrated self-assessment. The wiki has a sibling concept, concepts/model-bias-toward-finding-something, that emphasises models don't reliably abstain; the Gapfill pattern works because it asks for a different signal — coverage gaps rather than findings — which the model can report without triggering the find-something bias.
What the pattern counteracts¶
Cloudflare's verbatim: "Counteracts the model's tendency to drift toward attack classes it has already had success with."
The drift is a sibling failure mode to agent hyperfixation — but operating across task batches rather than within a single agent session. When agents have recently found buffer overflows, subsequent agents (sharing the same model bias) are more likely to look for buffer overflows again. Without an explicit gapfill mechanism, the queue concentrates around "fashionable" attack classes and under-covers others.
Why a separate stage instead of fixing in-prompt¶
A "be exhaustive" instruction in the original prompt has known failure modes:
- Model bias toward emission overrides the exhaustive-instruction. The model still drifts.
- Long prompts hit context budget. Adding "and also consider buffer overflows, race conditions, integer overflows, command injection, …" expands the prompt for every task even when most are irrelevant.
- No closure mechanism. A single-agent "be exhaustive" doesn't have a structural way to verify exhaustiveness.
A separate Gapfill stage decouples coverage from individual agent runs and makes coverage a pipeline property — the queue accumulates gap-tasks until the pipeline drains them.
Sibling patterns on the wiki¶
- patterns/multi-stage-vulnerability-discovery-harness — the harness Gapfill is one stage of.
- patterns/parallel-narrow-agents-over-exhaustive — the parallel-narrow shape Gapfill feeds new tasks into.
- patterns/narrow-scoped-agent-task — gapfill tasks inherit the same per-task narrow-scope shape.
Generalises beyond vulnerability research¶
The pattern shape — agent self-reports incompleteness; incompleteness becomes a new task — applies anywhere exhaustive coverage is desired but not feasible per-session:
- Code review: reviewer flags "I focused on the auth changes; the migration script also changed and I didn't fully review it" — flag becomes a new review task.
- Document analysis: analyser flags "I extracted product-features but the document also has a pricing section I didn't extract".
- Test coverage: agent flags "I added tests for the happy path; the error-handling branches are uncovered".
In all cases the gap is a structured output the model emits alongside its primary work, not a separate analysis pass.
Cost / requirements¶
- Per-task structured-output schema that includes a gaps field — without it, the model has no place to emit the signal.
- Prompt discipline — explicitly asking for gaps, framed in a way that the model treats as separate from emitting findings (so the find-something bias doesn't contaminate the gap report).
- Queue dedup — when many hunters report the same gap, the gap should appear in the queue once, not N times. Reuses the Dedupe stage of the harness.
- Termination criterion — the gap-loop has to halt; otherwise gaps spawn gaps spawn gaps. Cloudflare doesn't disclose theirs (per-area cap, per-batch budget, or diminishing-returns heuristic are all plausible).
Open / not disclosed¶
- Gap-task prioritisation — when gaps re-queue, do they go to the front of the queue, the back, or distributed across batches?
- Quantitative coverage uplift from Gapfill — Cloudflare doesn't measure how much additional coverage Gapfill adds vs running an extra round of fresh tasks.
- Termination — the post doesn't disclose how the loop exits.
Seen in¶
- sources/2026-05-18-cloudflare-project-glasswing-what-mythos-showed-us — canonical wiki articulation; the Gapfill stage of Cloudflare's harness.
Related¶
- patterns/multi-stage-vulnerability-discovery-harness — the harness this is a stage of.
- patterns/parallel-narrow-agents-over-exhaustive — the parallelism pattern Gapfill feeds.
- patterns/narrow-scoped-agent-task — the shape gap- tasks inherit.
- concepts/model-bias-toward-finding-something — the failure mode gapfill compensates for in a calibration- workable way.
- concepts/agent-hyperfixation-failure-mode — sibling drift failure mode at single-session scale.
- systems/cloudflare-vulnerability-discovery-harness — the canonical implementation.