CONCEPT Cited by 1 source
Integration tests against real database¶
Definition¶
Integration tests against real database is the testing discipline in which a test case runs against a live instance of the actual database technology that production uses — same engine, same schema, same query planner, same constraint enforcement — rather than against a mock, an in-memory fake (H2, SQLite), or a shared staging database.
The property that distinguishes real-DB integration tests from adjacent practices:
- vs mock-based unit tests — the test exercises actual query-planner behaviour, index selection, constraint enforcement, transaction semantics, lock interaction. See concepts/mock-object-maintenance-cost.
- vs in-memory-fake tests — H2 and SQLite have their own quirks that diverge from production (different SQL dialects, different null-handling, different constraint semantics, no real locking). Passing tests against H2 means your code works on H2 — which isn't what you ship.
- vs shared staging tests — a shared staging DB has test isolation problems (one test's inserts visible to another) + contention problems (deployments block tests, tests corrupt shared state) + flakiness.
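The divergence of in-memory fakes can be seen with a few lines against SQLite itself. The sketch below uses only Python's standard `sqlite3` module; the table and column names are illustrative assumptions, not taken from the source. Both inserts succeed on SQLite, whereas PostgreSQL would reject each of them (a non-numeric string in an integer column, and a value longer than the declared `VARCHAR(3)`).

```python
import sqlite3


def sqlite_accepts_text_in_integer_column() -> str:
    """Insert a non-numeric string into an INTEGER column; return the stored type."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (quantity INTEGER)")
        # SQLite's type affinity stores this value as text instead of rejecting it.
        conn.execute("INSERT INTO orders (quantity) VALUES ('not-a-number')")
        (stored_type,) = conn.execute(
            "SELECT typeof(quantity) FROM orders"
        ).fetchone()
        return stored_type
    finally:
        conn.close()


def sqlite_ignores_varchar_length() -> int:
    """Insert a 10-character string into a VARCHAR(3) column; return its length."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (code VARCHAR(3))")
        # SQLite does not enforce the declared length, so nothing is rejected.
        conn.execute("INSERT INTO users (code) VALUES ('abcdefghij')")
        (length,) = conn.execute("SELECT length(code) FROM users").fetchone()
        return length
    finally:
        conn.close()


if __name__ == "__main__":
    print(sqlite_accepts_text_in_integer_column())
    print(sqlite_ignores_varchar_length())
```

A test suite that only ever runs against SQLite would never see either rejection, which is the gap a real-DB test closes.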
The historical barrier¶
Real-DB integration tests have always been technically possible — just usually too slow or too expensive to be practical:
- Per-test DB spin-up: booting a fresh Postgres / MySQL instance per test run takes seconds to minutes. Doing so per test (not per run) is prohibitive.
- Snapshot-restore test fixtures: load a 2 GB fixture for each test — minutes per test, unrunnable at scale.
- CI runner cost: full-database CI runners cost more than containerised mocked-runner fleets; teams skip the real-DB path to save on fleet spend.
So the pragmatic industry compromise has been mocks for everything, plus a real DB for a handful of high-value integration tests and staging smoke tests. That leaves the gap between mocks and production to be caught in staging or as an early production surprise.
What changes with cheap branching¶
When database branching on a copy-on-write substrate takes about a second and is effectively free (Lakebase / Neon: 1.09 s for a 63 MB Backstage catalog), the historical barrier collapses. Real-DB integration tests become:
- Per-PR: CI creates its own branch from production (or from a golden baseline), runs full test suite, destroys branch. Every PR validated against production-equivalent schema + data shape.
- Per-QA-tester: each QA engineer gets their own branch, can corrupt/reset at will.
- Per-developer IDE: the IDE-database-branching pattern Thoughtworks describes for Backstage — every feature branch auto-provisions a matching database branch the developer writes + tests against.
See patterns/database-branch-per-test-over-mocking for the workflow shape.
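The per-PR lifecycle described above (create a branch, run the suite, destroy the branch) might be sketched as a context manager. Everything here is a hypothetical illustration: `InMemoryBranchClient`, its `create_branch` / `delete_branch` methods, the `ci-pr-N` naming scheme, and the placeholder connection string are assumptions made so the example runs on its own. None of it is the Lakebase or Neon API.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


class InMemoryBranchClient:
    """Hypothetical stand-in for a branching API; it only tracks branch names."""

    def __init__(self) -> None:
        self.branches: set[str] = {"production"}

    def create_branch(self, name: str, parent: str) -> str:
        if parent not in self.branches:
            raise ValueError(f"unknown parent branch: {parent}")
        self.branches.add(name)
        # Placeholder connection string, not a real endpoint.
        return f"postgres://example.invalid/{name}"

    def delete_branch(self, name: str) -> None:
        self.branches.discard(name)


@contextmanager
def pr_database_branch(
    client: InMemoryBranchClient, pr_number: int, parent: str = "production"
) -> Iterator[str]:
    """Create a branch for one PR and always delete it afterwards."""
    name = f"ci-pr-{pr_number}"
    dsn = client.create_branch(name, parent)
    try:
        yield dsn
    finally:
        # Runs even when the test suite raises, so failed builds do not leak branches.
        client.delete_branch(name)


def run_ci(
    client: InMemoryBranchClient, pr_number: int, test_suite: Callable[[str], bool]
) -> bool:
    """Run the given suite against a fresh branch and return its result."""
    with pr_database_branch(client, pr_number) as dsn:
        return test_suite(dsn)


if __name__ == "__main__":
    client = InMemoryBranchClient()
    print(run_ci(client, 42, lambda dsn: dsn.endswith("ci-pr-42")))
    print(sorted(client.branches))
```

Putting the tear-down in a `finally` block is the design choice that matters: the branch is removed whether the suite passes, fails, or crashes.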
Thoughtworks' before/after framing (Backstage POC)¶
Before (traditional cycle):
1. Create git branch for feature development.
2. Write mock objects for every database interface (MockUserRepository, MockOrderService, ...) for testing purposes.
3. Write unit tests with a mocked or in-memory database (H2, SQLite).
4. Submit PR, review, merge.
5. Deploy to shared staging environment.
6. Discover that the schema migration doesn't work against real data or the size of data is a blocker.
7. Fix schema migration, redeploy, repeat.
After (branching-enabled):
- Create git branch — database branch created automatically in < 1 second.
- IDE connects to the real branch database immediately.
- Write code and run migrations against real live database data from the first line of code.
- Write integration tests against the real database — not database mocks.
- Experiment with multiple solutions (rollback is trivial).
- Push + open PR — CI creates its own database branch, validates both code and schema, publishes a schema diff.
- QA team members get their own branch for destructive testing — reset in seconds.
- Merge — CD pipeline migrates upstream environments and cleans up branches.
The before-state's step 6 ("discover that the schema migration doesn't work against real data") is the load-bearing pain point cheap branching eliminates: schema migrations are validated against real data during development, not during staging deploy.
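Why a migration can pass in development and fail at staging is easy to reproduce in miniature. In the sketch below the same migration is applied once to an empty table and once to a table holding duplicate values. The `users` table, the index name, and the use of Python's built-in `sqlite3` as the engine are assumptions made purely so the example runs anywhere; in the workflow described above the connection would point at the branch database instead.

```python
import sqlite3

# Hypothetical migration: enforce uniqueness on an existing column.
MIGRATION = "CREATE UNIQUE INDEX users_email_key ON users (email)"


def migration_succeeds(seed_rows: list[tuple[str]]) -> bool:
    """Apply MIGRATION to a users table pre-loaded with seed_rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (email TEXT)")
        conn.executemany("INSERT INTO users (email) VALUES (?)", seed_rows)
        try:
            conn.execute(MIGRATION)
        except sqlite3.DatabaseError:
            # Existing rows violate the new constraint, so the migration fails.
            return False
        return True
    finally:
        conn.close()


if __name__ == "__main__":
    # Empty schema, as a mock-based or fixture-free test would see it.
    print(migration_succeeds([]))
    # Production-shaped data containing a duplicate.
    print(migration_succeeds([("a@example.com",), ("a@example.com",)]))
```

The migration itself is identical in both runs; only the data differs, which is why validating against real data during development surfaces the failure earlier.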
Trade-offs¶
Upsides:
- Tests catch the full range of bugs mocks miss (constraint violations, migration failures, planner surprises, lock-ordering, index scan regressions).
- Developers see production-shape data while writing code, not after deploy.
- QA destructive testing doesn't require a separate environment request.
Downsides:
- Test-time latency budget is bigger. Real-DB tests run ~1-2 orders of magnitude slower than mocked tests (milliseconds for mocks; 10-100 ms for a real-DB hit, even with a warm connection). Suites must be structured around this.
- Branch-cleanup discipline. Branches accrue storage cost for divergent pages (see concepts/copy-on-write-storage-fork); stale branches leak cost. Per-PR branch tear-down on PR close + TTL on unused branches become CI hygiene primitives.
- Sensitive data in branches. Branching production-equivalent data into developer environments raises the compliance + PII bar. Production data often needs anonymisation or synthesis before it's safe to expose to every developer. Not addressed in the Lakebase POC.
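The branch-cleanup rule mentioned in the downsides (tear down on PR close, plus a TTL on unused branches) could be expressed as a small selection function. This is a sketch under stated assumptions: the `BranchInfo` record, its fields, and the seven-day default TTL are hypothetical choices for illustration, and a real job would obtain the branch list and perform the deletions through whatever API the branching platform provides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class BranchInfo:
    """Hypothetical metadata a cleanup job might hold for each branch."""

    name: str
    last_used: datetime
    pr_number: Optional[int] = None  # None for branches not tied to a PR


def branches_to_delete(
    branches: Iterable[BranchInfo],
    open_prs: set[int],
    now: datetime,
    ttl: timedelta = timedelta(days=7),  # assumed default, tune per team
) -> list[str]:
    """Return names of branches whose PR is closed or that exceeded the TTL."""
    doomed = []
    for branch in branches:
        pr_closed = branch.pr_number is not None and branch.pr_number not in open_prs
        expired = now - branch.last_used > ttl
        if pr_closed or expired:
            doomed.append(branch.name)
    return doomed


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [
        BranchInfo("ci-pr-1", now - timedelta(hours=1), pr_number=1),
        BranchInfo("ci-pr-2", now - timedelta(hours=1), pr_number=2),
        BranchInfo("dev-alice", now - timedelta(days=30)),
    ]
    print(branches_to_delete(sample, open_prs={1}, now=now))
```

Keeping the selection logic pure (no network calls) makes the cleanup policy itself unit-testable without a database.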
Not universally replacing mocks¶
Real-DB integration tests replace mock objects that stand in for the database. They don't replace:
- External-service mocks (payment gateways, email, third-party APIs) — not branchable from a DB primitive.
- Deterministic failure injection — a mock can reliably raise a specific exception; a real DB can't always be coaxed into producing a specific failure mode on demand.
- Unit tests of pure functions — if the code doesn't touch the DB, don't put it behind one for testing.
See concepts/mock-object-maintenance-cost for the specific subset of test-double usage this concept replaces.
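Deterministic failure injection is the case where a mock remains the simpler tool. The sketch below uses `unittest.mock.Mock` with a `side_effect` list so that the first two calls raise and the third succeeds, which exercises a retry path on demand. `TransientDbError`, `save_with_retry`, and the `repo.save` interface are hypothetical names introduced for the example.

```python
from unittest.mock import Mock


class TransientDbError(Exception):
    """Hypothetical error type standing in for a driver's transient failure."""


def save_with_retry(repo, record: dict, attempts: int = 3) -> int:
    """Call repo.save, retrying on TransientDbError.

    Returns the 1-based attempt number that succeeded.
    """
    for attempt in range(1, attempts + 1):
        try:
            repo.save(record)
            return attempt
        except TransientDbError:
            if attempt == attempts:
                raise  # out of retries, surface the failure
    raise ValueError("attempts must be at least 1")


def demo() -> int:
    """Inject two failures followed by a success and report the winning attempt."""
    repo = Mock()
    # With an iterable side_effect, exception instances are raised in order
    # and other values are returned.
    repo.save.side_effect = [TransientDbError(), TransientDbError(), None]
    return save_with_retry(repo, {"id": 1})


if __name__ == "__main__":
    print(demo())
```

Reproducing exactly two transient failures followed by a success against a real database would require fault-injection tooling outside the database itself, which is why this subset of test doubles stays useful.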
Seen in¶
- sources/2026-04-30-databricks-backstage-with-lakebase — canonical first wiki instance. Thoughtworks Backstage POC articulates the before/after developer-cycle shift: a multi-step mock-based cycle with a schema-migration surprise at staging deploy, vs a branching-based cycle with real-DB tests from the first line of code. Paired with the 1.09-second / 3.78-second branching + PITR numbers that make the shift economically viable at per-developer + per-PR granularity on Lakebase.
Related¶
- concepts/mock-object-maintenance-cost — the specific cost category real-DB integration tests eliminate.
- concepts/database-branching — the substrate primitive that makes the workflow economically viable.
- concepts/copy-on-write-storage-fork — the mechanism under the cheap-branching cost envelope.
- concepts/test-feedback-loop — the broader property the discipline optimises.
- patterns/database-branch-per-test-over-mocking — the formalised workflow.
- systems/lakebase — the canonical substrate where the Thoughtworks POC demonstrates the workflow.