
PATTERN

LLM PR code review

LLM PR code review is the pattern of running every incoming pull request through a two-stage LLM classifier that decides whether the change is benign or malicious and emits a structured rationale. Verdicts are routed into a traditional security-incident pipeline (SIEM → detection rule → case → incident).

The canonical wiki instance is BewAIre, Datadog's in-house system running at ≈10,000 PRs/week across internal and external repositories. Its first publicly disclosed production catch, the 2026-02-27 hackerbot-claw campaign, is documented in sources/2026-03-09-datadog-when-an-ai-agent-came-knocking.

Shape

GitHub events → filter (security-relevant triggers: PRs, pushes)
  → diff extract + normalize + enrich
    → LLM classifier (two stages)
      → verdict {benign | malicious, rationale}
        → SIEM ingest → detection rule → signal → SIRT case
          → (escalate) → incident
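
The verdict-to-SIEM hop in the shape above can be sketched in a few lines. This is a hypothetical illustration: the field names (`label`, `rationale`), the `route_verdict` helper, and the routing strings are assumptions for clarity, not BewAIre's published schema.

```python
from dataclasses import dataclass

# Hypothetical verdict shape: label plus structured rationale, as in the
# pipeline diagram. Datadog has not published BewAIre's actual schema.
@dataclass
class Verdict:
    pr_url: str
    label: str      # "benign" | "malicious"
    rationale: str  # structured explanation from the classifier

def route_verdict(v: Verdict) -> str:
    """Map a classifier verdict onto the pipeline's next hop (assumed names)."""
    if v.label == "malicious":
        # SIEM ingest -> detection rule fires -> signal -> SIRT case
        return "siem:signal->sirt-case"
    # Benign verdicts still land in the SIEM, but only as log context.
    return "siem:log-only"

print(route_verdict(Verdict("https://github.com/org/repo/pull/1",
                            "malicious", "adds curl|sh to CI")))
# → siem:signal->sirt-case
```

The point of the sketch is the asymmetry: only malicious verdicts generate a signal that opens a SIRT case; benign ones are retained as searchable context.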

The two stages are explicitly named in Datadog's 2026-03-09 post. The specific prompts, models, evaluation methodology, and false-positive rates are not publicly disclosed.

Why two stages

The exact rationale is not given in the Datadog post, but typical justifications for two-stage classifier designs are:

  • Stage 1: a cheap, fast filter routes out obviously benign changes (formatting, dependency bumps, small diffs to whitelisted files) to cut per-PR cost.
  • Stage 2: a deeper review, with more context, a larger model, and structured output, runs only on the reduced candidate set.

This mirrors patterns/two-stage-evaluation at a different workload — the general shape of "cheap O(1) filter protecting a rich decision stage."
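
A minimal sketch of what a stage-1 "cheap filter" could look like, assuming simple heuristics such as a file allowlist and a diff-size cap. The file names, threshold, and `stage1_needs_review` helper are invented for illustration; Datadog has not disclosed BewAIre's actual stage-1 logic.

```python
# Files whose changes are treated as routine (assumed list, for illustration).
BENIGN_ONLY_FILES = {"package-lock.json", "Cargo.lock", "CHANGELOG.md"}

def stage1_needs_review(changed_files: list[str], diff_lines: int) -> bool:
    """Return True if the PR should escalate to the expensive stage-2 model."""
    # Pure lockfile/changelog bumps with a small diff are routed out as benign.
    routine = all(f.split("/")[-1] in BENIGN_ONLY_FILES for f in changed_files)
    if routine and diff_lines < 200:
        return False
    # Everything else gets the deeper stage-2 review.
    return True

print(stage1_needs_review(["package-lock.json"], 40))  # → False (routed out)
print(stage1_needs_review(["ci/deploy.sh"], 12))       # → True (escalated)
```

The design choice is the same as in patterns/two-stage-evaluation: stage 1 only needs to be safe in one direction (never routing out anything stage 2 should see), so it can be crude as long as it is conservative.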

Production latency disclosed

Datadog's first production alert, on 2026-02-27, fired ~33 seconds after the malicious PR was opened (05:26:25 → 05:26:58). This is the only latency datum in the source.

Caveats

  • FP / FN rates not published — Datadog discloses only the two confirmed true positives. Whether analysts drown in false positives at 10K-weekly-PR scale is unanswered.
  • Classifier prompts not published — which limits reproducibility but matches standard practice for security classifiers (publishing the prompts gives attackers targeting guidance — see the hackerbot-claw system-prompt-knowledge observation under concepts/autonomous-attack-agent).
  • Model cost at 10K-PR/week volume not disclosed — commercial sensitivity.
  • No comparison to non-LLM baselines — e.g., regex + static-analysis rules — to quantify the LLM-specific lift.
