CONCEPT Cited by 1 source
Test pyramid¶
Definition¶
A test pyramid (Mike Cohn, popularised by Martin Fowler) is a shape heuristic for a test suite's composition. From widest (foundation) to narrowest (apex):
- Unit tests — fastest, most numerous. One module in isolation.
- Component / service tests — mid-layer. One service with its dependencies stubbed or faked.
- Integration tests — narrower. Exercise the boundary between your code and one real external system (DB, HTTP peer, queue, object store). Pay real-engine / real-wire cost; in return, catch bugs unit tests can't.
- System / E2E / manual tests — narrowest. Full stack, expensive to write and run, flaky.
The shape encodes a cost/coverage tradeoff: each layer up is slower, flakier, and more expensive to maintain, so fewer tests should live there. Bug-detection reach runs the other way: higher-layer tests catch categories that lower ones miss (wiring, real-engine behaviour, environment), but at a high cost per test.
Worked Zalando ratio¶
Zalando Marketing Services uses ≈ 25% integration tests relative to unit tests as a rule of thumb, with the explicit caveat that it "varies from application to application". This sits near the middle of the industry spread (Google's original heuristic was 70 / 20 / 10 for small / medium / large); the specific ratio matters less than the shape. (Source: sources/2021-02-24-zalando-integration-tests-with-testcontainers)
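The arithmetic behind the two figures can be checked with a minimal sketch. The function name and the suite counts are hypothetical, and reading the "medium" share of 70 / 20 / 10 as the integration layer is an assumption made only for the comparison.

```python
from fractions import Fraction


def it_to_unit_ratio(unit_count: int, integration_count: int) -> Fraction:
    """Integration tests expressed as a fraction of unit tests."""
    if unit_count <= 0:
        raise ValueError("unit_count must be positive")
    return Fraction(integration_count, unit_count)


# Hypothetical suite: 400 unit tests and 100 integration tests.
zms_style = it_to_unit_ratio(400, 100)      # 1/4, i.e. 25%

# A 70 / 20 / 10 split, assuming "small" ~ unit and "medium" ~ integration.
split_70_20_10 = it_to_unit_ratio(70, 20)   # 2/7, roughly 29%

print(float(zms_style), round(float(split_70_20_10), 3))  # 0.25 0.286
```

Under that reading the two heuristics land within a few percentage points of each other, which is consistent with the point that the shape matters more than the exact number.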
Common inversions (anti-pyramid)¶
- Ice-cream cone: lots of manual / E2E, few unit. Slow, flaky, expensive; catches surface regressions at enormous cost per bug.
- Hourglass: lots of unit + lots of E2E, missing middle. Unit tests say the pieces work; E2E say the whole works; but integration-layer bugs (wire format drift, real-engine corner cases) fall through.
- Square: equal counts at every layer. Usually means the lower layers aren't being invested in.
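The shapes above can be told apart from per-layer test counts. The following is a rough sketch only: the function name, the three-layer simplification (component tests folded into the integration count), and the 20% tolerance for "equal" are all assumptions, not a rule taken from the source.

```python
def classify_shape(unit: int, integration: int, e2e: int,
                   tolerance: float = 0.2) -> str:
    """Label a suite's shape from its per-layer test counts (heuristic)."""
    counts = (unit, integration, e2e)
    if min(counts) < 0 or sum(counts) == 0:
        raise ValueError("counts must be non-negative and not all zero")
    # Square: all layers within `tolerance` of the largest layer.
    if max(counts) - min(counts) <= tolerance * max(counts):
        return "square"
    # Pyramid: each layer up is strictly narrower.
    if unit > integration > e2e:
        return "pyramid"
    # Ice-cream cone: the pyramid turned upside down.
    if e2e > integration > unit:
        return "ice-cream cone"
    # Hourglass: the middle layer is thinner than both ends.
    if integration < unit and integration < e2e:
        return "hourglass"
    return "other"


print(classify_shape(700, 200, 100))  # pyramid
print(classify_shape(50, 200, 700))   # ice-cream cone
print(classify_shape(500, 20, 400))   # hourglass
print(classify_shape(100, 95, 90))    # square
```

Raw counts are a crude proxy, since one E2E test can cover far more code than one unit test; the sketch is meant to make the vocabulary concrete rather than to serve as a quality gate.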
Why integration tests can't replace unit tests¶
- Startup cost. A unit test takes milliseconds; an integration test (IT) against a real Postgres container takes seconds. A fixed suite-time budget therefore caps how many ITs the suite can afford.
- Failure diagnosis. Unit test failures localise; IT failures implicate the whole wire + dependency path.
- Flakiness risk compounds. Each real dependency adds a failure mode. Unit tests have near-zero infra flakiness.
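Two of these arguments are simple arithmetic, sketched below. The numbers are illustrative assumptions (5 ms per unit test, 3 s per integration test, a 10-minute budget, 1% infrastructure failure rate per dependency), and the flakiness model assumes dependencies fail independently.

```python
def max_tests_in_budget(budget_ms: int, ms_per_test: int) -> int:
    """How many sequential tests of a given duration fit in the time budget."""
    if ms_per_test <= 0:
        raise ValueError("ms_per_test must be positive")
    return budget_ms // ms_per_test


def pass_probability(dependency_failure_rates) -> float:
    """Chance that no dependency fails during one test run,
    assuming each dependency fails independently at its given rate."""
    probability = 1.0
    for rate in dependency_failure_rates:
        probability *= 1.0 - rate
    return probability


budget_ms = 10 * 60 * 1000                       # hypothetical 10-minute suite budget
unit_capacity = max_tests_in_budget(budget_ms, 5)     # 120000 unit tests
it_capacity = max_tests_in_budget(budget_ms, 3000)    # 200 integration tests

# One test touching three real dependencies, each flaky 1% of the time.
three_deps = pass_probability([0.01, 0.01, 0.01])     # about 0.97
```

With these assumed figures the same budget holds several hundred times more unit tests than integration tests, and a three-dependency test fails for infrastructure reasons roughly three runs in a hundred even when the code is correct.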
Why unit tests can't replace integration tests¶
- Stub drift. A unit test that stubs the database cannot detect that the production DB rejects the query plan, that json_agg doesn't do what the stub said, or that a migration broke.
- Wire-format edge cases. HTTP peer returns 5xx, times out, breaks connection mid-response; only a real server exposes those.
- Spring context / DI wiring. A unit test doesn't exercise the bean graph; an IT does. Even one contextLoads() IT detects wiring regressions and Flyway migration failures (Zalando call-out).
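Stub drift can be illustrated with a small sketch. Everything here is hypothetical: the class and function names are invented, and an in-memory SQLite database stands in for the real engine purely to keep the example self-contained, whereas the cited source runs its ITs against real containers via Testcontainers.

```python
import sqlite3


class StubUserStore:
    """Hand-written stub: remembers rows in a list and enforces nothing."""

    def __init__(self):
        self.rows = []

    def add(self, email):
        self.rows.append(email)


class SqlUserStore:
    """Same interface, backed by a SQL engine that enforces a UNIQUE constraint."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE users (email TEXT NOT NULL UNIQUE)")

    def add(self, email):
        self.conn.execute("INSERT INTO users (email) VALUES (?)", (email,))


def register(store, email):
    """Code under test: normalises the address, then stores it."""
    store.add(email.strip().lower())


def accepts_duplicate(store) -> bool:
    """Register the same address twice; report whether the store allowed it."""
    register(store, "a@example.com")
    try:
        register(store, " A@example.com ")
    except sqlite3.IntegrityError:
        return False
    return True


stub_result = accepts_duplicate(StubUserStore())                           # True
real_result = accepts_duplicate(SqlUserStore(sqlite3.connect(":memory:")))  # False
```

The stub-backed check passes silently because the stub never learned about the constraint; only the engine-backed check shows that the second registration is rejected. A real production database can diverge from an in-memory substitute in the same way, which is the case the integration layer exists to cover.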
Seen in¶
- sources/2021-02-24-zalando-integration-tests-with-testcontainers — Zalando ZMS anchors its testing discipline on Fowler's pyramid; uses ~25% IT to unit as the team heuristic. Canonicalises the structural role of the pyramid in justifying Testcontainers investment.
Related¶
- concepts/first-test-principles — the FIRST properties every layer still has to satisfy.
- concepts/automated-vs-manual-testing-complementarity — the manual-tests layer's residual role.
- patterns/property-based-testing — a technique that thickens the unit layer with more coverage per test.
- patterns/real-docker-container-over-in-memory-fake — the integration-layer implementation choice.
- patterns/failsafe-integration-test-separation — Maven plumbing that runs the two layers in different phases.