PATTERN

Pre-human agent review

Intent

Put agent-driven review in front of human review. By the time a human sees the PR, the obvious issues have been caught by automated reviewers — the human is reviewing architecture and design intent, not nitpicking. This unblocks human-review throughput, which is the team-level bottleneck on agent-driven development.

Canonical articulation — Atlassian Fireworks, 2026-04-24:

"If you're blocked on human review, your throughput is gated by the slowest reviewer. Teams need to embrace AI-assisted reviews and shift their attention to the high level: architecture, design intent, risk, rather than nitpicking details. The agents can handle the details."

"For bigger, scarier PRs: spin up an independent agent to review before a human even looks at it." (Source: sources/2026-04-24-atlassian-rovo-dev-driven-development)

Shape

      Main agent writes PR
   ┌─────────────────────────────┐
   │ TIER 1 — Adversarial agent  │  ← catches bugs / design flaws
   │    (pre-human)              │     main agent missed
   └─────────────────────────────┘
   ┌─────────────────────────────┐
   │ TIER 2 — CI quality gate    │  ← lint / vet / tests / Helm
   │    (automated)              │
   └─────────────────────────────┘
   ┌─────────────────────────────┐
   │ TIER 3 — Human review       │  ← architecture / design intent
   │    (architectural)          │     risk
   └─────────────────────────────┘
              Merge

The key idea: human review is the last tier, not the only tier.
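The tiered stack above can be sketched as a fail-fast pipeline. All names here are hypothetical, not the Fireworks implementation: the point is only that a PR reaches the human tier after the lower tiers pass, and a failing tier routes the PR back to the main agent instead of queueing it for a human.

```python
def review_pipeline(pr, tiers):
    """Run review tiers in order; stop at the first failing tier.

    Each tier is a callable returning (passed, findings). Reporting the
    failing tier's number lets the caller send the PR back to the main
    agent rather than adding it to the human review queue.
    """
    for i, tier in enumerate(tiers, start=1):
        passed, findings = tier(pr)
        if not passed:
            return {"passed": False, "failed_tier": i, "findings": findings}
    return {"passed": True, "failed_tier": None, "findings": []}


# Hypothetical tier implementations -- stand-ins for the real reviewers.
def adversarial_agent(pr):   # tier 1: independent bug-hunting agent
    return (True, [])

def ci_quality_gate(pr):     # tier 2: lint / vet / tests / Helm
    return (True, [])

def human_review(pr):        # tier 3: architecture / design intent / risk
    return (True, [])

result = review_pipeline({"diff": "..."},
                         [adversarial_agent, ci_quality_gate, human_review])
```

Because the pipeline short-circuits, a PR that fails tier 1 never consumes CI minutes or human attention.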

Why this is a throughput pattern, not just a quality pattern

Agent-driven development can produce PRs faster than a team of humans can review them. If human review is the only gate, the pipeline bottlenecks on the slowest human reviewer — and the agent's throughput gain is lost to queueing. The source post names this explicitly: "your throughput is gated by the slowest reviewer."

Moving detail-level review (bug-hunting, nitpicks, style) to automated tiers leaves the human reviewer with a smaller per-PR cognitive load, which lets one human reviewer handle more PRs per day. The bottleneck shifts from "reviewing all PRs" to "architectural judgment on PRs that need it."

Tier allocation — what each tier does

| Tier | Reviewer | Catches | Calibrated for |
|------|----------|---------|----------------|
| 1 | Adversarial sub-agent | Bugs the main agent missed; weak invariants; hallucinated tests | Detail-level correctness |
| 2 | CI quality gate | Lint / static analysis / test suite / deployment-config validity | Repeatable, mechanical checks |
| 3 | Human reviewer | Architecture, design intent, risk, alignment with team direction | Judgement-level concerns |

A well-partitioned review stack has no overlap: tier 1 doesn't re-run the CI checks tier 2 covers, and tier 3 doesn't re-do the nitpicks tier 1 already caught.
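One way to make that no-overlap property checkable is to give each check category exactly one owning tier. A minimal sketch, with illustrative category names that are assumptions rather than anything from the source:

```python
# Each check category is owned by exactly one tier, so a later tier
# never re-runs or re-reports what an earlier tier already covers.
TIER_OWNS = {
    1: {"bug", "weak_invariant", "hallucinated_test"},              # adversarial agent
    2: {"lint", "static_analysis", "test_suite", "deploy_config"},  # CI gate
    3: {"architecture", "design_intent", "risk"},                   # human reviewer
}

def owning_tier(category):
    """Return the single tier responsible for a check category."""
    for tier, cats in TIER_OWNS.items():
        if category in cats:
            return tier
    raise ValueError(f"unassigned category: {category}")

def assert_partition():
    """Fail if any category is claimed by two tiers (overlap)."""
    seen = set()
    for cats in TIER_OWNS.values():
        overlap = seen & cats
        assert not overlap, f"overlapping categories: {overlap}"
        seen |= cats
```

Running `assert_partition()` in CI would catch a tier definition drifting into another tier's territory.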

What "bigger, scarier PRs" gets

For high-risk PRs, the Fireworks team spins up a dedicated pre-human review agent, not just the default adversarial sub-agent. The escalation helps for two reasons:

  1. Model quality floor. Bigger PRs justify a more capable (more expensive) reviewer model that wouldn't be cost-effective on every PR.
  2. Domain specialisation. A pre-human reviewer for a security-sensitive PR can be prompted specifically for security-style review, distinct from the general adversarial reviewer. See patterns/specialized-reviewer-agents for the domain-axis variant.
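Both escalation axes can be sketched as a small reviewer-selection policy. The model names, prompt names, and size threshold below are all assumptions for illustration, not values from the source:

```python
def pick_reviewer(lines_changed, risk_tags, big_pr_threshold=500):
    """Choose a pre-human reviewer configuration for a PR.

    risk_tags is a set of labels such as {"security"}; an empty set
    means a routine PR. Threshold and model names are hypothetical.
    """
    if "security" in risk_tags:
        # Domain specialisation: a security-prompted reviewer, distinct
        # from the general adversarial reviewer.
        return {"model": "large", "prompt": "security-review"}
    if lines_changed >= big_pr_threshold or risk_tags:
        # Model quality floor: a more capable model, justified only for
        # bigger or otherwise risky PRs.
        return {"model": "large", "prompt": "general-adversarial"}
    # Default: the cheap adversarial sub-agent that runs on every PR.
    return {"model": "small", "prompt": "general-adversarial"}
```

The policy keeps the expensive reviewer off routine PRs, where it would not pay for itself.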

Process commitment — "main deploys to dev without PRGB"

The source post's organisational commitment that makes this pattern work:

"We have made great progress in our team having main deploy to dev without PRGB (Peer Review / Green Build). This lets us ship to internal test cases and ourselves faster. We can't afford waiting hours for a human PR, especially in a multi-timezone world."

The "main deploy to dev without PRGB" commitment says: the agent-driven review tiers plus the canary-to-prod guardrails are sufficient to gate dev deploys; human PRGB is not required to ship to dev. Production still gets canary deploys across multiple clusters (see patterns/rbac-jit-as-agent-safety-net).

This is a team-level choice — it doesn't work unless the team trusts the lower tiers.
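Under that commitment, the deploy gating rule might look like the sketch below. Environment and gate names are illustrative assumptions; the substance is that dev requires only the automated tiers, while prod additionally requires the canary rollout, and neither path to dev requires human PRGB.

```python
REQUIRED_GATES = {
    # Hypothetical gate names. Note: no human-review gate for dev.
    "dev":  {"adversarial_review", "ci_green"},
    "prod": {"adversarial_review", "ci_green", "canary"},
}

def can_deploy(env, gates_passed):
    """True if every gate required for `env` is in `gates_passed`."""
    return REQUIRED_GATES[env] <= set(gates_passed)
```

A PR with green automated tiers ships to dev immediately; prod still waits on the canary.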

When it fits

  • High-volume agent-driven PR stream. The throughput problem the pattern solves only exists if you have the throughput.
  • Team willing to trust lower tiers. Pre-human review tiers only help if the team treats them as real review, not noise.
  • Multi-timezone / 24h development cadence. Queue-for-human-review latency hurts more when timezones are involved.

When it doesn't fit

  • Low-volume / high-stakes domain. If every PR is critical and the team can afford thorough human review of every one, pre-human tiers add complexity without fixing a bottleneck.
  • Regulated environments requiring human sign-off. Some compliance regimes require human review as a primary gate, not a secondary one.

Composition

Seen in
