PATTERN
CI as agent quality gate¶
Intent¶
The AI agent is inside the CI loop: when the CI pipeline runs on its PR, the agent reads the pipeline output, identifies failures, and addresses them autonomously before requesting human review. CI is not a gate humans enforce — it's a feedback channel the agent iterates against.
Canonical articulation — Atlassian Fireworks, 2026-04-24:
"CI pipeline as quality gate: Every PR runs lint, vet, tests, and Helm validation. The agent reads pipeline output and addresses failures before requesting review." (Source: sources/2026-04-24-atlassian-rovo-dev-driven-development)
Shape¶
Agent pushes PR
      │
      ▼
[CI pipeline] ─── lint
      │           ├── vet
      │           ├── tests
      │           └── Helm validation
      │
      ▼
Pipeline output ──► read by agent
      │                    │
 fail │               pass │
      ▼                    ▼
Agent patches code    Request human review
      │
      └──── loop back to push ────┘
The pipeline is unchanged from a traditional CI setup; what changes is the consumer of the pipeline output — the agent, not a human.
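The loop above can be sketched as a poll-and-patch cycle. This is a minimal illustration, not a real Rovo Dev or Bitbucket Pipelines API: the callables and the `{"status": ..., "failures": [...]}` result shape are assumptions.

```python
# Minimal sketch of the agent-in-the-loop cycle. All callables and the
# result shape are hypothetical placeholders for illustration.

def ci_iteration_loop(run_pipeline, propose_patch, request_review, max_attempts=5):
    """Iterate against CI output until it passes, then hand off to a human."""
    for _ in range(max_attempts):
        result = run_pipeline()            # trigger CI and wait for its output
        if result["status"] == "pass":
            request_review()               # only a green pipeline reaches a human
            return True
        for failure in result["failures"]:
            propose_patch(failure)         # each failure becomes a patch attempt
    return False                           # escalate to a human after repeated failures

# Simulated pipeline: fails lint once, passes after the "patch".
state = {"fixed": False, "reviewed": False}

def fake_pipeline():
    if state["fixed"]:
        return {"status": "pass", "failures": []}
    return {"status": "fail",
            "failures": [{"check": "lint", "file": "main.go", "line": 12}]}

def fake_patch(failure):
    state["fixed"] = True                  # pretend the agent fixed the finding

def fake_review():
    state["reviewed"] = True

assert ci_iteration_loop(fake_pipeline, fake_patch, fake_review)
assert state["reviewed"]
```

Note the escalation path: after `max_attempts` failed iterations the loop returns control to a human rather than spinning forever.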
Why this matters¶
Without the agent reading pipeline output, you get one of two degenerate patterns:
- Human-in-the-loop for every failure. Every pipeline failure bounces back to a human, who must diagnose it, feed the diagnosis to the agent, re-run, wait, and review the output. Human review latency is paid on every CI failure.
- Agent doesn't wait for CI. The agent pushes PRs assuming CI passes; when CI fails, the PR sits until a human notices, inflating PR-to-merge latency.
Putting the agent inside the CI loop closes both gaps: CI failures become agent-iteration inputs, not human-intervention triggers.
Tool contract — pipeline output must be agent-readable¶
The pattern requires the CI surface to expose its output in a form the agent can consume:
- Structured output preferred. Linter findings with file/line/message. Test failures with stack traces. Helm-validation errors with resource references.
- Integrated agent access. The Fireworks team uses Rovo Dev with first-party Bitbucket Pipelines integration — "the agent can raise PRs, read diffs, and monitor builds without leaving the conversation." See systems/bitbucket-pipelines.
- Low latency. The agent's iteration rhythm depends on CI turnaround — minutes, not hours.
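What "structured output" buys the agent can be shown with a toy example. The JSON schema below is hypothetical (real linter and pipeline formats vary); the point is that file/line/message triples are directly actionable.

```python
import json

# Hypothetical structured linter output; real pipeline schemas vary,
# but the useful property is the same: file/line/message triples.
raw = '''[
  {"file": "cmd/server/main.go", "line": 42, "message": "unused variable 'cfg'"},
  {"file": "charts/app/values.yaml", "line": 7, "message": "unknown field 'replicaCounts'"}
]'''

findings = json.loads(raw)

# Group findings by file so the agent can patch one file at a time.
by_file = {}
for f in findings:
    by_file.setdefault(f["file"], []).append((f["line"], f["message"]))

assert len(by_file) == 2
assert by_file["cmd/server/main.go"] == [(42, "unused variable 'cfg'")]
```

With plain-text logs the agent has to regex failures out of free-form output first; structured findings skip that fragile step.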
What CI catches vs. what the adversarial reviewer catches¶
CI covers repeatable mechanical checks; the adversarial reviewer covers judgement-level bugs:
| CI catches | Adversarial sub-agent catches |
|---|---|
| Lint / style / formatting | Subtle logic bugs |
| Static analysis (vet) | Missing edge cases |
| Existing test failures | Hallucinated invariants |
| Config / Helm validity | Tautological tests |
Both tiers are necessary; neither subsumes the other. See patterns/pre-human-agent-review for the three-tier stack.
Guardrail — CI is declarative, agent doesn't edit it¶
One important safety property: the CI pipeline itself is not the agent's to modify. The agent reads pipeline output; it does not edit the pipeline definition to make failures go away. (If the agent can edit bitbucket-pipelines.yml to skip a failing check, the gate stops being a gate.) Treat the CI config as a protected surface: human-controlled, versioned, and reviewed separately.
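One way to enforce this is a pre-merge check that rejects agent-authored diffs touching the pipeline definition. A sketch, where the `PROTECTED` path list is an assumption to adapt to your repo layout:

```python
# Sketch of a pre-merge guard rejecting diffs that touch protected CI
# surfaces. PROTECTED is an assumed list, not a standard convention.
PROTECTED = ("bitbucket-pipelines.yml", ".ci/")

def guardrail_violations(changed_files):
    """Return the protected paths an agent-authored diff touches, if any."""
    return sorted(
        path for path in changed_files
        if any(path == p or path.startswith(p) for p in PROTECTED)
    )

# In practice changed_files would come from `git diff --name-only main...HEAD`.
assert guardrail_violations(["cmd/main.go", "bitbucket-pipelines.yml"]) == ["bitbucket-pipelines.yml"]
assert guardrail_violations(["cmd/main.go"]) == []
```

Running this as its own pipeline step (or a branch-permission rule where the CI platform supports one) keeps the gate outside the agent's reach.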
When it fits¶
- Agent-driven development. The pattern assumes the agent is the primary author of the code and the tests.
- CI has structured output. The agent needs parseable feedback. Plain-text logs work less well than structured findings.
- CI latency is minutes. Slower CI breaks the iteration rhythm; the agent ends up context-switched-out by the time the pipeline returns.
When it doesn't fit¶
- CI failures require physical access / hardware. Some tests (embedded, hardware-in-the-loop) can't be iterated on by the agent alone.
- Non-deterministic CI. If CI fails randomly, the agent's loop gets noisy — it tries to fix things that aren't broken. Non-determinism has to be handled at the CI level first.
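One common mitigation for the flakiness caveat is to have the loop re-run a failing check before treating it as a patch target. A sketch; the retry count and pass/fail interface are assumptions:

```python
import itertools

# Sketch: re-run a failing check to decide whether the failure is real
# or flaky before the agent tries to "fix" it. Retry count is an assumption.
def classify_failure(run_check, retries=3):
    """Return 'pass', 'flaky', or 'real' based on repeated runs of a check."""
    results = [run_check() for _ in range(retries)]   # True means the check passed
    if all(results):
        return "pass"
    return "flaky" if any(results) else "real"

# A check that alternates pass/fail looks flaky; a constant failure is real.
alternating = itertools.cycle([False, True]).__next__
assert classify_failure(alternating) == "flaky"
assert classify_failure(lambda: False) == "real"
```

This only quarantines flakes from the agent's loop; the non-determinism itself still has to be fixed at the CI level, as the bullet above says.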
Composition¶
- With patterns/ai-writes-own-e2e-tests — the tests the CI gate runs are themselves agent-authored.
- With patterns/pre-human-agent-review — CI is tier 2 of the three-tier pre-human review stack.
- With patterns/ci-cd-agent-guardrails — CI guardrails scale agent autonomy over time as trust is established.
- With patterns/ai-generated-fix-forward-pr — same "agent responds to failure signal" shape, applied to production incidents rather than CI failures.
Seen in¶
- sources/2026-04-24-atlassian-rovo-dev-driven-development — canonical instance. Fireworks PRs go through Bitbucket Pipelines with lint, vet, tests, and Helm validation; Rovo Dev reads pipeline output and iterates before requesting human review.