PATTERN Cited by 1 source

Cross-repo tracer fan-out

Intent

When a confirmed bug lives in a shared library consumed by multiple downstream repositories, decide separately for each consumer repo whether the bug is reachable from outside. Do this by fanning out one tracer agent per consumer repository; each agent uses a cross-repo symbol index and returns a reachable/unreachable verdict. This converts a single "is this exploitable?" question into N parallel "is it exploitable from this consumer?" questions, each of which is sharply scoped.

Canonical articulation

Cloudflare on the Trace stage of its vulnerability discovery harness (Source: sources/2026-05-18-cloudflare-project-glasswing-what-mythos-showed-us):

"For each confirmed finding in a shared library, a tracer agent fans out (one instance per consumer repository), uses a cross-repo symbol index, and decides whether attacker-controlled input actually reaches the bug from outside the system. Turns 'there is a flaw' into 'there is a reachable vulnerability.' This is the stage that matters most."

Why per-consumer fan-out, not one cross-cutting analysis

Three properties motivate the fan-out shape:

  1. Reachability is consumer-specific. A function in libfoo may be unreachable from service-A (which exposes only a subset of the API to external callers) but trivially reachable from service-B (which proxies external requests through the affected entry point). A single cross-cutting analysis would have to hold all N consumer call graphs in one context, which is exactly the coverage failure this pattern is designed to avoid.
  2. Per-consumer scope hits the narrow-scope sweet spot. The tracer agent's prompt is essentially "in this specific repo, can attacker-controlled input reach this specific function via this specific call pattern?" — narrow enough to give the model real leverage.
  3. Parallel fan-out scales with the consumer count. A shared library with 50 consumers spawns 50 tracer agent instances, so the number of traces grows linearly with the number of consumers while each trace stays scoped to a single repo.
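
The fan-out shape can be sketched in a few lines. This is a toy illustration under stated assumptions, not Cloudflare's implementation: the call graphs, entry points, and the `trace` and `fan_out` names are hypothetical, and a breadth-first search over a hand-written call graph stands in for the tracer agent, which in the source is a model-driven agent.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-consumer call graphs: caller -> callees.
CALL_GRAPHS = {
    "service-a": {
        "handle_request": ["parse_headers"],
        "admin_job": ["libfoo::vulnerable_fn"],  # called, but not from an entry point
    },
    "service-b": {
        "handle_request": ["proxy"],
        "proxy": ["libfoo::vulnerable_fn"],
    },
}
# Hypothetical externally exposed entry points (the trust boundary) per consumer.
ENTRY_POINTS = {"service-a": ["handle_request"], "service-b": ["handle_request"]}


def trace(consumer: str, target: str) -> tuple[str, bool]:
    """Stand-in tracer: breadth-first search from entry points to the target."""
    graph = CALL_GRAPHS[consumer]
    queue = deque(ENTRY_POINTS[consumer])
    seen = set(queue)
    while queue:
        fn = queue.popleft()
        if fn == target:
            return consumer, True
        for callee in graph.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return consumer, False


def fan_out(target: str, consumers: list[str]) -> dict[str, bool]:
    """Run one independent trace per consumer repository, in parallel."""
    with ThreadPoolExecutor(max_workers=max(1, len(consumers))) as pool:
        return dict(pool.map(lambda c: trace(c, target), consumers))


verdicts = fan_out("libfoo::vulnerable_fn", ["service-a", "service-b"])
print(verdicts)
```

Each trace sees only its own consumer's graph, which mirrors the point above: service-a calls the function but only from an internal job, while service-b routes external requests into it.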

What the tracer agent needs

Input | Source
----- | ------
Finding (function, vulnerability class, PoC) | Hunt → Validate stage output
Consumer repo code | Per-consumer-repo agent context
Cross-repo symbol index | Pre-computed; resolves libfoo::vulnerable_fn references in consumer source
Trust-boundary documentation | Recon stage output for the consumer repo

The cross-repo symbol index is the load-bearing capability — without it, the tracer agent can't resolve which consumer code paths actually call the affected function.
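
As a rough sketch of what the index provides to one tracer, assume it is a precomputed mapping from a symbol to every reference across all repos. The source does not describe the index's schema, so the `CallSite` fields, the `SYMBOL_INDEX` contents, the file paths, and the `call_sites_in` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallSite:
    repo: str          # consumer repository containing the reference
    path: str          # file within that repository
    line: int          # line number of the reference
    enclosing_fn: str  # function that contains the call


# Hypothetical precomputed index: symbol -> every reference across all repos.
SYMBOL_INDEX: dict[str, list[CallSite]] = {
    "libfoo::vulnerable_fn": [
        CallSite("service-a", "src/jobs/admin.rs", 41, "admin_job"),
        CallSite("service-b", "src/proxy.rs", 88, "proxy"),
        CallSite("service-b", "src/retry.rs", 17, "retry_upstream"),
    ],
}


def call_sites_in(repo: str, symbol: str) -> list[CallSite]:
    """Return only the references to `symbol` that live in `repo`."""
    return [site for site in SYMBOL_INDEX.get(symbol, []) if site.repo == repo]


for site in call_sites_in("service-b", "libfoo::vulnerable_fn"):
    print(f"{site.path}:{site.line} in {site.enclosing_fn}")
```

With a lookup like this, the tracer for a given consumer starts from a short list of concrete call sites instead of searching the repo for them.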

Output shape

For each (finding, consumer-repo) pair:

  • Reachable — attacker-controlled input has a path to the bug. The finding upgrades to a vulnerability for that consumer.
  • Unreachable — no attacker-controlled path. The bug should still be fixed upstream, at library-cleanup priority, but it is not a vulnerability for this consumer.

Cloudflare names the consequence: "Reachable traces become new hunt tasks in the consumer repositories where the bug is actually exposed" — the Feedback stage closes the loop back into the Hunt queue, scoped to the consumer's threat surface.
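
One way to picture the verdict and the feedback step, assuming a simple record per (finding, consumer-repo) pair. The `Verdict`, `TraceResult`, and `route` names, the finding ID, and the hunt-task dictionary shape are hypothetical; the source does not describe the actual data model.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    REACHABLE = "reachable"
    UNREACHABLE = "unreachable"


@dataclass(frozen=True)
class TraceResult:
    finding_id: str
    consumer_repo: str
    verdict: Verdict


def route(results: list[TraceResult]) -> tuple[list[dict], list[TraceResult]]:
    """Split trace results into new hunt tasks and not-exposed pairs."""
    hunt_tasks, not_exposed = [], []
    for result in results:
        if result.verdict is Verdict.REACHABLE:
            # Reachable: feed back into the Hunt queue, scoped to this consumer.
            hunt_tasks.append(
                {"repo": result.consumer_repo, "seed_finding": result.finding_id}
            )
        else:
            # Unreachable: still an upstream bug, not a vulnerability here.
            not_exposed.append(result)
    return hunt_tasks, not_exposed


results = [
    TraceResult("F-1", "service-a", Verdict.UNREACHABLE),
    TraceResult("F-1", "service-b", Verdict.REACHABLE),
]
hunt_tasks, not_exposed = route(results)
print(hunt_tasks)
```

The same finding can land on both sides of the split, which is the point of keying the verdict on the pair and not on the finding alone.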

Why this is the stage that matters most

Cloudflare's verbatim ranking:

"Turns 'there is a flaw' into 'there is a reachable vulnerability.' This is the stage that matters most."

The Hunt stage produces findings; the Trace stage produces vulnerabilities. The two are not synonyms. A finding without an attacker-reachable path is an internal-quality issue. The Trace stage is what tells security teams which findings to act on now and which to file as cleanup.

Cost / requirements

  • Cross-repo symbol index — a precomputed index mapping symbol references across repos. Without it, tracer agents re-do symbol resolution per task, burning tokens.
  • Consumer-repo enumeration — knowing which repos consume the shared library. Typically a build-system artifact (Cargo.lock / package.json / Bazel BUILD).
  • Per-consumer scope hints — trust boundaries and entry points per consumer, ideally produced by per-consumer Recon. Without these, a tracer agent explores each consumer as aimlessly as a generic coding agent pointed at the original repo.
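
Consumer-repo enumeration from build-system artifacts can be sketched for the package.json case. This is a minimal illustration that assumes the manifests have already been collected as text, keyed by repo name; the `consumers_of` helper and the example manifests are hypothetical. It checks only direct dependencies, so transitive consumers would need lockfile data that this sketch does not read.

```python
import json


def consumers_of(library: str, manifests: dict[str, str]) -> list[str]:
    """Return repos whose package.json declares `library` as a direct dependency."""
    consumers = []
    for repo, manifest_text in manifests.items():
        manifest = json.loads(manifest_text)
        declared = {
            **manifest.get("dependencies", {}),
            **manifest.get("devDependencies", {}),
        }
        if library in declared:
            consumers.append(repo)
    return sorted(consumers)


# Hypothetical manifests, keyed by repo name.
manifests = {
    "service-a": '{"name": "service-a", "dependencies": {"libfoo": "^1.2.0"}}',
    "service-b": '{"name": "service-b", "devDependencies": {"libfoo": "1.2.3"}}',
    "service-c": '{"name": "service-c", "dependencies": {"left-pad": "1.3.0"}}',
}
print(consumers_of("libfoo", manifests))
```

The resulting list is what determines how many tracer instances get spawned for a given finding.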

Open / not disclosed

  • Cross-repo symbol index implementation — Cloudflare doesn't name the underlying tool (Sourcegraph, glean, custom).
  • What "attacker-controlled input" means for non-public services — Cloudflare's edge-platform threat model has a clear "outside the system"; on internal services the trust boundary is more nuanced.
  • Per-tracer false-positive rate — given a model's bias toward reporting a finding over staying silent, tracers may also assert reachability that doesn't hold. Cloudflare doesn't disclose how Trace output is validated.
