PATTERN Cited by 1 source

Layered testing strategy

Pattern

Compose a tiered test suite in which each layer tests a different scope at a different cost-latency trade-off: cheap and fast at the domain-logic base, progressively more integrated and expensive toward the top. Each tier catches a distinct class of failure; running all three tiers (unit, contract, smoke) gives the agent a behavioral oracle at every cost level.

Named by the 2026-03-26 AWS Architecture Blog post:

"In agentic workflows, tests do more than catch regressions, they define acceptable behavior. A layered testing strategy works particularly well:

  • Unit tests validate domain logic in isolation and run quickly, making them ideal for frequent AI-driven iterations.
  • Contract tests verify that services honor agreed interfaces, catching breaking changes early.
  • Smoke tests run against deployed environments to surface configuration or permission issues that only appear at runtime, such as missing AWS Identity and Access Management (IAM) permissions."

The three-layer stack

| Tier | Scope | Latency | What fails here that nowhere else does |
| --- | --- | --- | --- |
| Unit | domain logic in isolation (no cloud, no SDK) | milliseconds | incorrect business rules, edge-case math, validation holes |
| Contract | interfaces between services (OpenAPI / Smithy / Protobuf conformance) | seconds | incompatible request shapes, response schema breakage, version skew |
| Smoke | deployed environment | seconds to minutes | missing IAM permissions, wrong region, missing environment variable, configuration bug |

The shape is a classic pyramid: many unit tests, some contract tests, and a thin smoke layer. Agents iterate mostly at the bottom.
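To make the three tiers concrete, here is a minimal sketch of one check per tier. Everything named in it (`order_total`, `ORDER_RESPONSE_CONTRACT`, `contract_violations`, `missing_settings`, and the setting names) is hypothetical and not taken from the AWS post. The contract check is a hand-rolled stand-in for real OpenAPI / Smithy / Protobuf conformance tooling, and the smoke check covers only the "missing env var" case; verifying IAM permissions would need real calls against a deployed environment, which this sketch does not attempt.

```python
from typing import Any, Mapping

# --- Unit tier: pure domain logic, no cloud, no SDK ------------------------
def order_total(unit_price_cents: int, quantity: int, discount_pct: int = 0) -> int:
    """Hypothetical business rule: total in cents after a percentage discount."""
    if quantity < 0 or not 0 <= discount_pct <= 100:
        raise ValueError("invalid quantity or discount")
    gross = unit_price_cents * quantity
    return gross - (gross * discount_pct) // 100


def test_unit_order_total() -> None:
    assert order_total(250, 4) == 1000
    assert order_total(250, 4, discount_pct=10) == 900
    assert order_total(250, 0) == 0


# --- Contract tier: does a response honor the agreed interface? ------------
# Hypothetical agreed response shape: field name -> expected Python type.
ORDER_RESPONSE_CONTRACT: Mapping[str, type] = {
    "order_id": str,
    "total_cents": int,
    "currency": str,
}


def contract_violations(payload: Mapping[str, Any], contract: Mapping[str, type]) -> list[str]:
    """List every field that is missing or has the wrong type."""
    problems = []
    for field, expected in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    return problems


def test_contract_order_response() -> None:
    ok = {"order_id": "o-1", "total_cents": 900, "currency": "USD"}
    assert contract_violations(ok, ORDER_RESPONSE_CONTRACT) == []
    broken = {"order_id": "o-1", "total_cents": "900"}  # wrong type, field dropped
    assert contract_violations(broken, ORDER_RESPONSE_CONTRACT) == [
        "wrong type for total_cents: expected int",
        "missing field: currency",
    ]


# --- Smoke tier: configuration that only exists in a deployed environment --
def missing_settings(env: Mapping[str, str], required: list[str]) -> list[str]:
    """Names of required settings that are absent or empty in the environment."""
    return [name for name in required if not env.get(name)]


def test_smoke_config() -> None:
    # In a real smoke test, `env` would be os.environ inside the deployment.
    deployed_env = {"ORDERS_TABLE": "orders-prod", "AWS_REGION": ""}
    required = ["ORDERS_TABLE", "AWS_REGION", "QUEUE_URL"]
    assert missing_settings(deployed_env, required) == ["AWS_REGION", "QUEUE_URL"]
```

The unit and contract checks run anywhere in well under a second, which is what makes them usable inside an agent's edit loop; only the smoke check depends on state that exists after deployment.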

Why agents especially need it

  • Unit tests are the agent's inner-loop oracle. At millisecond latency the agent can validate dozens of proposed changes per minute.
  • Contract tests catch cross-service breakage early — when the agent is editing both sides of an API, contract tests fail fast rather than surfacing at integration time.
  • Smoke tests close the runtime gap. Even perfect unit + contract coverage can't catch "missing IAM permissions" or "wrong S3 bucket policy" — these require a deployed environment per the 2026-03-26 post.
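One way an agent harness might use the tiers is to run them cheapest-first and report the first tier that fails, so the agent gets feedback at the lowest cost level that can detect the problem. The sketch below assumes that fail-fast ordering; the AWS post does not prescribe it, and `Tier` and `first_failing_tier` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Tier:
    name: str
    # Each check raises AssertionError on failure and returns None on success.
    checks: Sequence[Callable[[], None]]


def first_failing_tier(tiers: Sequence[Tier]) -> Optional[str]:
    """Run tiers in the given (cheapest-first) order.

    Returns the name of the first tier with a failing check, or None if
    every check in every tier passes. Later, more expensive tiers are not
    run once a cheaper tier has failed.
    """
    for tier in tiers:
        for check in tier.checks:
            try:
                check()
            except AssertionError:
                return tier.name
    return None


# Usage with stand-in checks (all hypothetical).
def unit_check() -> None:
    assert 1 + 1 == 2


def contract_check() -> None:
    response = {"order_id": "o-1"}
    assert "total_cents" in response, "contract field missing"


smoke_calls: list[str] = []


def smoke_check() -> None:
    smoke_calls.append("ran")  # records whether the expensive tier was reached


tiers = [
    Tier("unit", [unit_check]),
    Tier("contract", [contract_check]),
    Tier("smoke", [smoke_check]),
]
failed = first_failing_tier(tiers)
```

Here `failed` is `"contract"` and `smoke_calls` stays empty: the contract failure is reported without paying for a deployment, which is the cost ordering the pattern relies on.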

Caveats

  • The 2026-03-26 source gives no coverage or runtime-cost numbers.
  • Real test pyramids have more layers in practice (integration, E2E, chaos, load). The AWS post keeps to three for clarity; the spirit generalizes to any cost-latency-sorted tiering.
