PATTERN Cited by 1 source

Invariant-driven programming¶

Pattern¶

Invariant-driven programming is the discipline of, for every non-trivial piece of code, explicitly naming the invariants the code relies on, writing them down (in comments, types, assertions, or tests), and checking each invariant against the substrate semantics rather than the intended semantics.

Substrate semantics = what the machine / language / runtime / protocol / hardware actually does. Intended semantics = what the programmer meant.

When the two diverge, bugs live in the gap. The invariant-driven programmer checks every invariant in the substrate's frame, not in their own mental frame.

The canonical instance: `(low + high) / 2`¶

From Bloch, 2006, quoting Jon Bentley's description of the binary-search midpoint line:

[this line] sets m to the average of l and u, truncated down to the nearest integer.

The stated invariant — "mid is the average of low and high, truncated down" — is:

True in the mathematical integers.
False in the 32-bit signed int machine representation, once low + high > INT_MAX.

The invariant was never checked against the substrate. Bentley proved it correct. Bloch implemented it in the JDK. Reviewers read it. Tests passed on small inputs. The bug survived for ~20 years in Programming Pearls and ~9 years in java.util.Arrays.binarySearch.

Bloch's opening paragraph is the statement of this pattern:

I remember vividly Jon Bentley's first Algorithms lecture at CMU… The key lesson was to carefully consider the invariants in your programs.

The irony — Bentley's lesson is the lesson; Bentley's code is the counterexample — is the wiki mnemonic for this pattern.

The discipline¶

For each piece of code you write, ask:

What invariants does this rely on? Range of inputs, ordering, nullness, thread-ownership, monotonic sequence, freshness, referential integrity, allocation alignment, memory ordering, etc.
For each invariant, what substrate is it true in? The integers? 32-bit signed? 64-bit unsigned? IEEE 754 doubles? A TCP ordered-delivery-within-a-connection? A single-writer WAL? Strong / weak / eventual consistency? A cache with TTL? An HTTP request where X-Forwarded-For is attacker-controlled?
Is the substrate I actually have the substrate the invariant requires? If not — make them match (widen the type, switch algorithm, tighten the contract, add a guard) or write the check.

The four substrate-gap bug classes¶

Wiki taxonomy of substrate mismatches, all covered by invariant-driven discipline:

Arithmetic-overflow gap — "average of two ints" isn't, for large values (concepts/integer-overflow). Fix: patterns/safe-midpoint-computation.
Memory-safety gap — "C pointer + length" is a safe buffer only if the convention is honoured (concepts/memory-safety). Rust's response: make the invariant part of the type so it's uncompilable to violate it.
Concurrency gap — "the caller holds the lock" is a convention unless enforced. concepts/deterministic-simulation makes the check reproducible; patterns/property-based-testing makes it exhaustive.
Protocol / distributed-state gap — "the reader sees the write" requires a consistency model the storage actually offers; eventual-consistency substrate + read-your-writes invariant = bug. concepts/strong-consistency / concepts/wal-write-ahead-logging are substrates chosen to satisfy specific invariants.

Techniques — how to operationalize¶

Type the invariant. If a value must be non-negative, use an unsigned type; if it must be sorted, use a sorted-collection type; if it must be non-null, use an Option / Maybe. The systems/aurora-dsql Rust-for-Postgres-extensions decision is an instance: encode the C-convention invariants into Rust types so violating them is uncompilable.
Assert the invariant. Write assert ! / invariant ! / debug_assert ! at every boundary where the invariant is entered or exited. Debug builds catch the violation loudly; property-based tests exercise the boundary.
Property-test the invariant (patterns/property-based-testing). Don't enumerate cases — describe the invariant as a universal property and let the test harness generate inputs that try to break it. Dropbox Nucleus is the canonical wiki instance.
Formal-model the invariant (concepts/lightweight-formal-verification). For durability-critical code, write an executable spec in the same repo + language; check the production implementation against it on every commit. ShardStore is the canonical wiki instance.
Document the invariant where the code lives. A comment above (low + high) / 2 saying "invariant: low + high fits in int" would have surfaced the bug at review time. Writing the invariant down is itself a debugging technique — half the bugs get fixed while writing the comment.

The epistemic consequence¶

From Bloch's closing paragraph:

Careful design is great. Testing is great. Formal methods are great. Code reviews are great. Static analysis is great. But none of these things alone are sufficient to eliminate bugs: They will always be with us.

Invariant-driven programming is not a silver bullet; it's a discipline that makes each of the other techniques more effective by giving them something concrete to check. Typed invariants make types catch more; asserted invariants make tests catch more; property-based invariants make generative testing cover more; formal-spec invariants make verification industrializable.

Seen in¶

sources/2025-01-11-google-nearly-all-binary-searches-and-mergesorts-are-broken-2006 — canonical: Bentley's "consider the invariants" lesson introduced in paragraph 1; canonical bug (invariant checked in mathematical-integer substrate, violated in 32-bit-int substrate) illustrated by paragraph 2 onward; meta-lesson ("program carefully, defensively, remain ever vigilant") closing the post.