CONCEPT Cited by 1 source
Signal-to-noise in AI vulnerability triage¶
Definition¶
Signal-to-noise in AI vulnerability triage is the operational problem that LLM-driven vulnerability scanners produce many more hedged, speculative findings than they do real bugs, and the per-finding cost of dismissing the hedged ones — "every speculative finding spends human attention and tokens to dismiss, and that cost compounds across thousands of findings" — dominates the triage pipeline unless explicitly controlled.
Cloudflare's verbatim canonical articulation (Source: sources/2026-05-18-cloudflare-project-glasswing-what-mythos-showed-us):
"One of the hardest parts of triaging security vulnerabilities is deciding which bugs are real, which are exploitable, and which need fixing now. This was a hard problem even in the pre-AI world. AI vulnerability scanners and AI-generated code have made it worse, and at Cloudflare we've built multiple post-validation stages to deal with it."
The two dominant noise factors¶
Cloudflare names the two axes that drive the noise rate:
- Programming language. "C and C++ give you direct memory control and, with it, bug classes — buffer overflows, out-of-bounds reads and writes — that memory-safe languages like Rust eliminate at compile time. We saw consistently more false positives from projects written in memory-unsafe languages."
Memory-unsafe languages produce more plausible-looking bug surfaces for the model to flag — bounds checks, pointer arithmetic, lifetime decisions — even when the surrounding code is correct. The model can produce a credible-sounding hedge against any of them.
- Model bias toward finding something. "A good human researcher tells you what they found and how confident they are. Models don't. Ask a model to find bugs, and it will find them, whether the code has any or not. Findings come back hedged with 'possibly,' 'potentially,' 'could in theory,' and the hedged findings vastly outnumber the solid ones."
The two compound: a model with the find-something bias applied to a memory-unsafe codebase produces the worst-case noise distribution in the triage queue.
Why noise is structurally costly, not just inconvenient¶
Cloudflare names the structural framing:
"That's a reasonable bias for an exploratory tool. It's a ruinous one for a triage queue, where every speculative finding spends human attention and tokens to dismiss, and that cost compounds across thousands of findings."
Three properties make the cost compound:
- Per-finding human dismissal cost is non-zero. A hedged finding still requires reading, reasoning, and marking it false-positive.
- Per-finding token cost is non-zero. If dismissal goes through another agent (review or judge), each dismissal costs API tokens.
- Volume scales with codebase × attack-class × hunter- count. At fleet scale (Cloudflare scans "more than fifty" repos with ~50 concurrent hunters per scan run), noise volume reaches thousands of findings without intervention.
The wiki sibling concept is concepts/false-positive-management — Figma's verbatim "manage false positives or they'll manage you". The AI-vuln-research instance is the same problem with a sharper substrate: the finding-emitter is a probabilistic model with bias toward emission.
Three levers Cloudflare uses to control it¶
The harness embeds noise control as multiple pipeline stages:
- Proof of exploitability attached to findings — "a finding that arrives with a PoC is a finding you can act on, and it means far less time spent asking 'is this even real?'" The proof reduces the per-finding dismissal cost from human reading + reasoning to "observe the PoC reproduces or doesn't".
- Adversarial validator agent with no ability to emit new findings between hunter and queue. "Putting two agents in deliberate disagreement is way more effective than just telling one agent to be careful." Forbidding the validator from emitting findings is load-bearing — without it, the validator inflates queue size.
- Dedup stage in the pipeline — "Findings that share the same root cause collapse into a single record. Variant analysis is a feature, not a way to inflate the queue with duplicates." Reduces noise volume by collapsing variant-of-the-same-bug findings.
These three together convert the noise problem from "unmanageable hedge stream" into "queue of validated, deduplicated, PoC-attached findings".
Tuning trade-off Cloudflare names¶
The post explicitly chooses to over-report at the input and rely on downstream stages to filter:
"Our harnesses are deliberately tuned to over-report, so we see more (and miss less), which comes with a lot more noise. But at triage time, Mythos Preview's output has noticeably higher quality: fewer hedged findings, clearer reproduction steps, and less work to reach a fix-or-dismiss decision."
The shape: lossy filters early are dangerous (you miss real bugs), so over-report and filter late with proof, adversarial review, and dedup. The downstream-filter quality is what makes the over-reporting strategy viable.
Why this is worse than pre-AI triage, in two ways¶
Cloudflare's verbatim is direct: "AI vulnerability scanners and AI-generated code have made it worse". Two compounding mechanisms:
- AI-generated code adds new bug substrates that AI-vuln scanners then find — a feedback loop where the same model class generates and audits code, with each step producing artefacts the other side flags.
- AI-vuln scanners scale finding-emission rate to levels human triage cannot keep up with, by orders of magnitude. The pre-AI-world's slow hand-research throughput was a natural circuit breaker; AI removes it.
Open / not disclosed¶
- Quantitative noise rate — "vastly outnumber", "a meaningful fraction" — no precise ratios.
- Per-language false-positive multiplier — Cloudflare states "consistently more" false positives on memory-unsafe codebases without quantifying the multiplier.
- Cost of running the validator stage as a fraction of total harness cost — not disclosed; only the qualitative "catches a meaningful fraction" claim.
Sibling concepts¶
- concepts/false-positive-management — the detection-system sibling. AI-vuln triage signal-to-noise is the vulnerability-research instance.
- concepts/model-bias-toward-finding-something — the upstream failure mode that drives the noise. Triage signal-to-noise is what you get when the bias is left unmanaged.
- concepts/proof-of-exploitability — the primary in-finding lever against the noise.
Seen in¶
- sources/2026-05-18-cloudflare-project-glasswing-what-mythos-showed-us — canonical wiki disclosure; the multi-stage harness is the architectural answer to the signal-to-noise problem.
Related¶
- concepts/false-positive-management, concepts/model-bias-toward-finding-something, concepts/proof-of-exploitability, concepts/exploit-chain-construction, concepts/memory-safety — the concept-level scaffolding.
- patterns/multi-stage-vulnerability-discovery-harness, patterns/adversarial-review-subagent — the pattern-level answers.
- systems/cloudflare-vulnerability-discovery-harness — the system-level instance.