CONCEPT Cited by 1 source
H2 vs real database testing¶
Definition¶
A recurring testing-infrastructure choice: run JVM tests
against an embedded, in-memory SQL emulator (classically
H2, also HSQLDB / Derby) configured in
MySQL/PostgreSQL compatibility mode, or against a real
database engine (Postgres, MySQL) running in a Docker
container via systems/testcontainers.
H2 is ~10× faster to start (~0.4 s vs ~4 s Postgres on Zalando's author machine — sources/2021-02-24-zalando-integration-tests-with-testcontainers) and needs no Docker daemon. But the compatibility-mode emulation is partial: it re-implements SQL syntax but not engine internals. Any test that exercises engine-specific behaviour can pass on H2 and fail in production.
What H2 misses for Postgres¶
Non-exhaustive:
jsonboperators —@>,?,?&,?|,#>,jsonb_path_query. H2 has a JSON type but different operator semantics.- Partitioning — declarative table partitioning
(
PARTITION BY RANGE / LIST / HASH), partition pruning, partition-wise joins. H2 has none of this. LATERALjoins — planner behaviour differs.- PL/pgSQL stored procedures — H2 uses Java; any
CREATE FUNCTION … LANGUAGE plpgsqlsilently no-ops or errors. - Locking + isolation level semantics — Postgres's MVCC
(row-level locks,
SELECT … FOR UPDATEwithNOWAIT/SKIP LOCKED, serializable snapshot isolation) is not what H2 does. - Specific error codes / constraint error shapes — tests that pattern-match on SQLSTATE or exception class produce false negatives.
- Full-text search (
tsvector,tsquery, GIN), PostGIS, extensions in general. - Replication semantics, logical decoding — can't be tested at all on H2.
- Migration scripts — Flyway
.sqlfiles with Postgres-specific syntax (DO $$ … $$,GENERATED ALWAYS AS IDENTITY,CREATE EXTENSION) fail on H2.
The failure mode H2 produces¶
A green test suite + red production. The worst variant is
passing tests for intentionally Postgres-specific code:
the team writes a jsonb query, tests it against H2 (in MySQL
or Postgres compat mode), the test passes because H2's JSON
behaviour coincidentally agrees on the happy path — and a
corner case (null-vs-missing key, nested-array operators)
breaks in production.
When H2 is still OK¶
- Pure unit tests of SQL-generation code that never actually executes (e.g. build-a-SQL-string logic).
- Super-thin DAO tests that only
SELECT * FROM t WHERE id = ?with no engine-specific features — but the gain over a pure unit test is marginal. - DB-unaware tests that happen to need a JDBC connection (wiring tests) and explicitly do not test DB behaviour.
What real-container testing costs¶
- ~4 s Postgres startup on a modern dev laptop. Multiplied over hundreds of tests without singleton sharing, this dominates test-suite time.
- Docker requirement — developer laptops and CI workers must have Docker Engine or a compatible runtime.
- CI fleet sizing — build hosts need more RAM and CPU; Localstack alone can eat GB of RAM.
The amortisation answer is the concepts/singleton-container-pattern: one container per JVM, shared across every test.
Seen in¶
- sources/2021-02-24-zalando-integration-tests-with-testcontainers — Zalando ZMS names the trade-off explicitly: "real PostgreSQL instead of the H2 database, which doesn't support the Postgres-specific functionality (e.g. partitioning or JSON operations)". 4 s vs 0.4 s startup cost disclosed as the tax.
Related¶
- patterns/real-docker-container-over-in-memory-fake — the pattern this concept motivates.
- systems/postgresql — the real engine.
- systems/testcontainers — the mechanism.
- concepts/test-pyramid — the layer this decision lives in.