PATTERN Cited by 1 source
Tests as executable specifications
Pattern
Treat the test suite not just as a regression net, but as the behavioral specification of the system — a corpus of executable assertions that both human reviewers and AI agents read to infer what the system is supposed to do. When a test fails, the failure itself is a teaching signal: the agent reads the assertion, the inputs, the expected output, and can often refine its change without re-prompting.
Named by the 2026-03-26 AWS Architecture Blog post:
"In agentic workflows, tests do more than catch regressions, they define acceptable behavior."
"Well-written tests also act as documentation. When a test fails, the agent can infer what behavior is expected and refine its changes accordingly."
Why it works for agents
- Tests are formally structured. Name, arrange, act, assert — the agent can parse structure in a way it can't parse a prose docstring.
- Failure output is diagnostic. "Expected X, got Y" is a machine-readable hint; the agent uses it to steer the next edit.
- Scope is implicit. A test suite's boundaries tell the agent what's in-contract and what isn't.
- Specs stay executable, not wishful. Prose docs rot; tests that rot fail CI.
Test-quality properties that make this work
- Clear names: `shouldRejectNegativeAmounts` teaches more than `test123`.
- One assertion per test: failure tells the agent which property broke.
- Realistic fixtures: not so stripped-down the test teaches a wrong model of the domain.
- Tight arrange/act/assert separation: the agent can read the inputs, action, and oracle independently.
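The four properties above might look like this in practice. This is an illustrative sketch only: the `Account` class and its `withdraw` rules are hypothetical, and the test names are snake_case renderings in the spirit of `shouldRejectNegativeAmounts`.

```python
import unittest
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Account:
    """Hypothetical domain object, present only so the tests have something to specify."""
    owner: str
    balance: Decimal

    def withdraw(self, amount: Decimal) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


class WithdrawSpec(unittest.TestCase):
    def setUp(self):
        # Arrange: a realistic fixture (named owner, decimal currency amount)
        # rather than a bare placeholder such as Account("x", 1).
        self.account = Account(owner="A. Customer", balance=Decimal("250.00"))

    def test_should_reject_negative_amounts(self):
        # Act and assert: one property, one oracle.
        with self.assertRaises(ValueError):
            self.account.withdraw(Decimal("-5.00"))

    def test_should_reject_withdrawal_above_balance(self):
        with self.assertRaises(ValueError):
            self.account.withdraw(Decimal("250.01"))

    def test_should_reduce_balance_by_withdrawn_amount(self):
        # Act
        self.account.withdraw(Decimal("40.00"))
        # Assert
        self.assertEqual(self.account.balance, Decimal("210.00"))
```

Because each test checks a single property, a failing test name alone identifies which rule a change violated.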
Sibling patterns in the wiki
- patterns/executable-specification — the formal-spec tier: a compact model of the system (~1% the size) in the production language, validated via property-based testing (systems/shardstore). This pattern (tests-as-spec) is the test-suite-not-formal-spec tier of the same idea.
- patterns/test-case-generation-from-spec — inverse direction: generate tests from a spec.
- concepts/specification-driven-development — the broader concept this pattern is a concrete realization of at the test-suite tier.
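To make the contrast with the formal-spec tier concrete, here is a rough sketch of checking an implementation against a compact reference model. It is not how ShardStore does it: the `LogStore` and `ModelStore` classes are hypothetical, and a hand-rolled randomized loop over the standard library's `random` module stands in for a real property-based testing library.

```python
import random


class LogStore:
    """Hypothetical 'production' store: append-only log with tombstones."""

    _TOMBSTONE = object()

    def __init__(self):
        self._log = []

    def put(self, key, value):
        self._log.append((key, value))

    def delete(self, key):
        self._log.append((key, self._TOMBSTONE))

    def get(self, key):
        # The newest entry for a key wins; a tombstone means "deleted".
        for logged_key, logged_value in reversed(self._log):
            if logged_key == key:
                return None if logged_value is self._TOMBSTONE else logged_value
        return None


class ModelStore:
    """Reference model: the simplest structure with the intended behavior."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

    def get(self, key):
        return self._data.get(key)


def check_against_model(seed: int, steps: int = 200) -> None:
    """Apply one random operation sequence to both stores and compare reads."""
    rng = random.Random(seed)  # seeded, so a failing sequence can be replayed
    real, model = LogStore(), ModelStore()
    for step in range(steps):
        key = rng.choice("abc")
        op = rng.choice(("put", "delete", "get"))
        if op == "put":
            value = rng.randint(0, 99)
            real.put(key, value)
            model.put(key, value)
        elif op == "delete":
            real.delete(key)
            model.delete(key)
        else:
            assert real.get(key) == model.get(key), (
                f"seed={seed} step={step}: get({key!r}) diverged"
            )
    # Final sweep: every key must agree once the sequence ends.
    for key in "abc":
        assert real.get(key) == model.get(key), f"seed={seed}: final get({key!r}) diverged"
```

Calling `check_against_model(seed)` for a range of seeds exercises many operation orderings. In the tests-as-spec tier described on this page, the expected behavior instead lives in individual hand-written assertions rather than in a model.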
Pairs with
- patterns/layered-testing-strategy — unit + contract + smoke tests, each tier a different spec granularity.
- concepts/contract-first-design — contract tests verify a formal contract the agent can also consult directly.
Caveats
- Bad tests teach wrong behavior. If tests lie (flaky, wrong assertions, inverted expectations), the agent learns the lie. Test quality is load-bearing under this pattern.
- Coverage gaps are invisible teachers. Absent tests look like "no spec"; the agent may invent plausible-but-wrong behavior. Pair with concepts/specification-driven-development for the formal-spec tier.
- No measurement. The 2026-03-26 source does not quantify how much agent behavior improves under well-written tests compared with poorly written ones.
Seen in
- sources/2026-03-26-aws-architecting-for-agentic-ai-development-on-aws — pattern introduction; "tests do more than catch regressions, they define acceptable behavior".