PATTERN

CI as agent quality gate

Intent

The AI agent is inside the CI loop: when the CI pipeline runs on its PR, the agent reads the pipeline output, identifies failures, and addresses them autonomously before requesting human review. CI is not a gate humans enforce — it's a feedback channel the agent iterates against.

Canonical articulation — Atlassian Fireworks, 2026-04-24:

"CI pipeline as quality gate: Every PR runs lint, vet, tests, and Helm validation. The agent reads pipeline output and addresses failures before requesting review." (Source: sources/2026-04-24-atlassian-rovo-dev-driven-development)

Shape

   Agent pushes PR
   [CI pipeline] ─── lint
                 ├── vet
                 ├── tests
                 └── Helm validation
   Pipeline output ──► read by agent
          │                │
          │ fail            │ pass
          ▼                ▼
   Agent patches code  Request human review
          └─── loop ───┘

The pipeline is unchanged from a traditional CI setup; what changes is the consumer of the pipeline output — the agent, not a human.
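The loop in the diagram can be sketched in a few lines. This is a hypothetical sketch, not Rovo Dev's implementation: `run_pipeline`, `patch_code`, and `request_review` stand in for whatever CI and agent integrations a given setup provides.

```python
# Hypothetical agent-side CI loop; run_pipeline, patch_code, and
# request_review are stand-ins for real CI/agent integrations.
MAX_ITERATIONS = 5  # bound the loop so a stuck agent escalates to a human

def ci_loop(pr, run_pipeline, patch_code, request_review):
    for _ in range(MAX_ITERATIONS):
        result = run_pipeline(pr)      # lint, vet, tests, Helm validation
        if result["status"] == "pass":
            return request_review(pr)  # humans enter only after CI passes
        # A CI failure is an iteration input, not a human-intervention trigger.
        pr = patch_code(pr, result["failures"])
    raise RuntimeError("CI still failing after max iterations; escalate to a human")
```

The bounded iteration count is the one design choice worth calling out: without it, a confused agent can loop forever against a failure it cannot fix.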

Why this matters

Without the agent reading pipeline output, you get one of two degenerate patterns:

  • Human-in-the-loop for every failure. Every pipeline failure bounces back to a human, who must diagnose it, feed the diagnosis to the agent, re-run, wait, and review the output. Human response latency is added to every CI failure.
  • Agent doesn't wait for CI. The agent pushes PRs assuming CI will pass; when CI fails, the PR sits until a human notices, inflating PR-to-merge latency.

Putting the agent inside the CI loop closes both gaps: CI failures become agent-iteration inputs, not human-intervention triggers.

Tool contract — pipeline output must be agent-readable

The pattern requires the CI surface to expose its output in a form the agent can consume:

  • Structured output preferred. Linter findings with file/line/message. Test failures with stack traces. Helm-validation errors with resource references.
  • Integrated agent access. The Fireworks team uses Rovo Dev with first-party Bitbucket Pipelines integration — "the agent can raise PRs, read diffs, and monitor builds without leaving the conversation." See systems/bitbucket-pipelines.
  • Low latency. The agent's iteration rhythm depends on CI turnaround — minutes, not hours.
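To make the "structured output" requirement concrete, here is a minimal sketch of turning a per-tool pipeline report into findings the agent can map to edits. The report schema is an assumption for illustration; real CI systems emit their own formats (e.g. SARIF for linters, JUnit XML for tests).

```python
import json

# Hypothetical structured pipeline report (schema assumed for illustration).
REPORT = json.loads("""
{
  "lint":  [{"file": "chart/values.yaml", "line": 7, "message": "trailing whitespace"}],
  "tests": [{"name": "TestRetry", "trace": "assert failed at retry_test.go:42"}],
  "helm":  []
}
""")

def actionable_findings(report):
    """Flatten the per-tool report into (tool, location, message) tuples
    the agent can map directly to edits."""
    findings = []
    for item in report["lint"]:
        findings.append(("lint", f'{item["file"]}:{item["line"]}', item["message"]))
    for item in report["tests"]:
        findings.append(("test", item["name"], item["trace"]))
    for item in report["helm"]:
        findings.append(("helm", item["resource"], item["error"]))
    return findings
```

The point is the shape, not the schema: each finding carries a location the agent can open and a message it can act on, which plain-text logs do not guarantee.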

What CI catches vs. what the adversarial reviewer catches

CI covers repeatable mechanical checks; the adversarial reviewer covers judgement-level bugs:

  CI catches                  Adversarial sub-agent catches
  ─────────────────────────   ─────────────────────────────
  Lint / style / formatting   Subtle logic bugs
  Static analysis (vet)       Missing edge cases
  Existing test failures      Hallucinated invariants
  Config / Helm validity      Tautological tests

Both tiers are necessary; neither subsumes the other. See patterns/pre-human-agent-review for the three-tier stack.

Guardrail — CI is declarative, agent doesn't edit it

One important safety property: the CI pipeline definition is not the agent's to edit. The agent reads pipeline output; it does not modify the pipeline to make failures go away. (If the agent can edit bitbucket-pipelines.yml to skip a failing check, the gate stops being a gate.) Treat the CI config as a protected surface: human-controlled, versioned, and reviewed separately.
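One way to enforce this, sketched under assumed paths (the protected-path list here is hypothetical; adapt it to your repo layout and pair it with branch-permission rules in the host platform):

```python
# Hypothetical pre-merge guard: flag agent PRs that touch the CI definition.
# PROTECTED_PATHS is an assumption; adjust to your repository layout.
PROTECTED_PATHS = {"bitbucket-pipelines.yml", ".github/workflows/"}

def protected_files_touched(changed_files):
    """Return the protected files a PR modifies, if any; an empty list
    means the PR leaves the CI definition alone."""
    return [f for f in changed_files
            if any(f == p or f.startswith(p) for p in PROTECTED_PATHS)]
```

A check like this belongs in the human-controlled part of the pipeline itself, so the agent cannot route around it.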

When it fits

  • Agent-driven development. The pattern assumes the agent is the primary author of the code and the tests.
  • CI has structured output. The agent needs parseable feedback. Plain-text logs work less well than structured findings.
  • CI latency is minutes. Slower CI breaks the iteration rhythm; the agent ends up context-switched-out by the time the pipeline returns.

When it doesn't fit

  • CI failures require physical access / hardware. Some tests (embedded, hardware-in-the-loop) can't be iterated on by the agent alone.
  • Non-deterministic CI. If CI fails randomly, the agent's loop gets noisy — it tries to fix things that aren't broken. Non-determinism has to be handled at the CI level first.
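While flakiness is being fixed at the CI level, a cheap agent-side mitigation is to re-run a failing check before treating the failure as real. A minimal sketch; the re-run count is an assumed threshold, not a recommendation from the source:

```python
# Hypothetical flake filter: a failure counts as real only if it
# reproduces on every re-run (reruns=2 is an assumed threshold).
def is_real_failure(run_check, reruns=2):
    """run_check() returns True on pass; re-run it and require the
    failure to reproduce consistently before the agent acts on it."""
    return all(not run_check() for _ in range(reruns))
```

This trades extra CI minutes for fewer wasted agent iterations on code that isn't broken.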

