PATTERN Cited by 1 source

Tests as executable specifications

Pattern

Treat the test suite not just as a regression net, but as the behavioral specification of the system — a corpus of executable assertions that both human reviewers and AI agents read to infer what the system is supposed to do. When a test fails, the failure itself is a teaching signal: the agent reads the assertion, the inputs, the expected output, and can often refine its change without re-prompting.

Named by the 2026-03-26 AWS Architecture Blog post:

"In agentic workflows, tests do more than catch regressions, they define acceptable behavior."

"Well-written tests also act as documentation. When a test fails, the agent can infer what behavior is expected and refine its changes accordingly."

Why it works for agents

Tests are formally structured. Name, arrange, act, assert — the agent can parse structure in a way it can't parse a prose docstring.
Failure output is diagnostic. "Expected X, got Y" is a machine-readable hint; the agent uses it to steer the next edit.
Scope is implicit. A test suite's boundaries tell the agent what's in-contract and what isn't.
Specs stay executable, not wishful. Prose docs rot; tests that rot fail CI.
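
A minimal sketch of the first two points, using Python's standard unittest module. The function apply_discount, its deliberately broken variant, and the helpers make_test_case and failure_report are hypothetical names invented for illustration; none of them come from the source post.

```python
import unittest


def apply_discount(price_cents, percent):
    """Hypothetical function under test: price in cents after a percent discount."""
    return price_cents * (100 - percent) // 100


def buggy_apply_discount(price_cents, percent):
    """Deliberately broken variant: subtracts the percent as if it were cents."""
    return price_cents - percent


def make_test_case(func):
    """Build a TestCase class that checks the given implementation."""

    class DiscountTest(unittest.TestCase):
        def test_ten_percent_off_1000_cents_is_900_cents(self):
            # Arrange: the inputs, readable on their own.
            price_cents, percent = 1000, 10
            # Act: the single call under test.
            result = func(price_cents, percent)
            # Assert: the oracle. On failure unittest reports both values.
            self.assertEqual(result, 900)

    return DiscountTest


def failure_report(func):
    """Run the test in-process and return the failure tracebacks as strings."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(make_test_case(func))
    result = unittest.TestResult()
    suite.run(result)
    return [traceback_text for _test, traceback_text in result.failures]
```

Calling failure_report(apply_discount) returns an empty list. Calling failure_report(buggy_apply_discount) returns one traceback containing "AssertionError: 990 != 900", which is the "Expected X, got Y" hint an agent could use to steer its next edit.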

Test-quality properties that make this work

Clear names — shouldRejectNegativeAmounts teaches more than test123.
One assertion per test — failure tells the agent which property broke.
Realistic fixtures — not so stripped-down the test teaches a wrong model of the domain.
Tight arrange/act/assert separation — the agent can read the inputs, action, and oracle independently.
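
The four properties above can be sketched in one small suite. This is an illustration under stated assumptions, not code from the source: the Account class, its deposit rule, and the make_account fixture are hypothetical, and the test name is a snake_case rendering of shouldRejectNegativeAmounts.

```python
import unittest
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Account:
    """Hypothetical domain object, invented for this example."""
    owner: str
    currency: str
    balance: Decimal

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit amount must be positive")
        self.balance += amount


def make_account():
    # Realistic fixture: an owner, a currency, and a decimal balance,
    # rather than a bare integer that hides what an account looks like.
    return Account(owner="Ada Example", currency="USD", balance=Decimal("125.50"))


class DepositTest(unittest.TestCase):
    def test_should_reject_negative_amounts(self):
        # Arrange
        account = make_account()
        # Act and assert: one property, rejection.
        with self.assertRaises(ValueError):
            account.deposit(Decimal("-5.00"))

    def test_should_leave_balance_unchanged_after_rejected_deposit(self):
        # Arrange
        account = make_account()
        # Act
        try:
            account.deposit(Decimal("-5.00"))
        except ValueError:
            pass
        # Assert: one property, no side effect on the balance.
        self.assertEqual(account.balance, Decimal("125.50"))

    def test_should_add_positive_amount_to_balance(self):
        # Arrange
        account = make_account()
        # Act
        account.deposit(Decimal("10.25"))
        # Assert
        self.assertEqual(account.balance, Decimal("135.75"))
```

Because each test checks a single property, a failing test name alone tells the reader whether rejection, the unchanged balance, or the happy path broke.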

Sibling patterns in the wiki

patterns/executable-specification — the formal-spec tier: a compact model of the system (~1% the size) in the production language, validated via property-based testing (systems/shardstore). This pattern (tests-as-spec) is the test-suite-not-formal-spec tier of the same idea.
patterns/test-case-generation-from-spec — inverse direction: generate tests from a spec.
concepts/specification-driven-development — the broader concept this pattern is a concrete realization of at the test-suite tier.

Pairs with

patterns/layered-testing-strategy — unit + contract + smoke tests, each tier a different spec granularity.
concepts/contract-first-design — contract tests verify a formal contract the agent can also consult directly.
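
As a rough sketch of the contract tier, the example below keeps the contract as plain data that a test checks and that an agent could also read directly. Everything here is an assumption made for illustration: the ORDER_RESPONSE_CONTRACT mapping, the get_order handler, and the contract_violations helper are hypothetical, and a real project would more likely use a schema language than a dict of Python types.

```python
# Hypothetical contract: required response fields and their expected types.
ORDER_RESPONSE_CONTRACT = {"order_id": str, "status": str, "total_cents": int}


def get_order(order_id):
    """Hypothetical handler standing in for the real service call."""
    return {"order_id": order_id, "status": "shipped", "total_cents": 4599}


def contract_violations(payload, contract):
    """Return human-readable problems; an empty list means the payload conforms."""
    problems = []
    for field, expected_type in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            actual = type(payload[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    return problems


def test_get_order_matches_contract():
    # Contract test: the handler's response must satisfy the shared contract.
    assert contract_violations(get_order("ord-123"), ORDER_RESPONSE_CONTRACT) == []
```

The violation messages follow the same "expected X, got Y" shape as a unit-test failure, so the contract tier gives the same kind of diagnostic at a coarser granularity.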

Caveats

Bad tests teach wrong behavior. If tests lie (flaky, wrong assertions, inverted expectations), the agent learns the lie. Test quality is load-bearing under this pattern.
Coverage gaps are invisible teachers. Absent tests look like "no spec"; the agent may invent plausible-but-wrong behavior. Pair with concepts/specification-driven-development and, for the formal-spec tier, patterns/executable-specification.
No measured effect. The 2026-03-26 source does not measure how much agent behavior improves under well-written tests compared with poorly written ones.

Seen in

sources/2026-03-26-aws-architecting-for-agentic-ai-development-on-aws — pattern introduction; "tests do more than catch regressions, they define acceptable behavior".