SYSTEM Cited by 2 sources
TLC Model Checker¶
Overview¶
TLC is the explicit-state finite-state model-checker for TLA+. Given a TLA+ spec (with a bounded configuration), TLC enumerates every reachable state and checks that every stated invariant holds. If an invariant is violated, TLC prints the shortest counter-example behaviour (sequence of actions from the initial state to the violation).
What TLC does for conformance checking¶
Spec-side checking (classic use)¶
- Run the spec against invariants — TLC enumerates the reachable state graph; verifies every invariant (safety: bad states never reached) and liveness property (good states eventually reached).
- Find counterexample behaviours — when an invariant fails, TLC emits the exact shortest path of actions that led to the violation.
Trace-checking oracle (patterns/trace-checking)¶
Following Ron Pressler's technique, production execution traces are
translated into Trace.tla files — specialised TLA+ spec encoding the
exact sequence of observed states. TLC runs the trace spec against the
real spec with the property Trace conforms to Spec; a rejected trace
means the impl did something the spec forbids.
MongoDB 2020 used this shape unsuccessfully (Source: sources/2025-06-02-mongodb-conformance-checking-at-mongodb-testing-that-our-code-matches-our-tla-specs): the mechanics worked, but the RaftMongo.tla spec was at a higher abstraction than the real implementation (atomic leader handoff vs. step-down-then-step-up), so traces never conformed.
Test-case-generation source (patterns/test-case-generation-from-spec)¶
TLC can emit the entire reachable state graph as a DAG of actions. A generator program can parse this and produce one unit test per behaviour. MongoDB 2020 used this for the Mobile SDK's OT merge rules:
- State space constrained to 30,184 states / 4,913 behaviours.
- Generator emitted 4,913 C++ unit tests.
- Tests achieved 100% branch coverage on the OT impl.
- Crucially, TLC crashed with a Java StackOverflowError on one conflict case (element-swap + element-move) due to infinite recursion in the algorithm itself — caught a real bug in the C++ that had survived years of unit tests and AFL-fuzzer runs.
Named feature additions since 2020¶
Davis 2025 calls out TLC additions that change the tractability of trace-checking:
- Incomplete trace inference — log only some state variables; TLC infers the missing ones. Announced with the 2024 Validating Traces of Distributed Programs Against TLA+ Specifications paper. Reduces the atomic-snapshot burden that killed MongoDB's 2020 attempt.
- State-graph export for test-case generation — issue #1073 tracks better export formats for programmatic consumption.
Limits¶
- Explicit-state enumeration — TLC stores every reachable state in memory (with disk spill). State-space explosion caps the practical constraint shape. For test-case generation, this means the impl constraint (number of participants, message sizes, etc.) has to be small enough for TLC to finish.
- Finite configuration only — TLC checks bounded specs. Unbounded specs (e.g. "any number of servers") require auxiliary tooling (TLAPS theorem prover, Apalache, or hand-written proofs) for full checking.
- Liveness under fairness — TLC handles liveness properties but often needs explicit fairness assumptions that bloat the state space.
Seen in¶
- sources/2025-06-02-mongodb-conformance-checking-at-mongodb-testing-that-our-code-matches-our-tla-specs —
MongoDB 2020 used TLC as both the trace-checker (failed — abstraction
mismatch) and the state-graph enumerator for test-case generation
(succeeded — 4,913 tests, 100% branch coverage, caught one infinite-
recursion bug via Java
StackOverflowError). - sources/2026-02-27-mongodb-towards-model-based-verification-of-a-key-value-storage-engine —
MongoDB's VLDB 2025 WiredTiger conformance
story makes use of a modified TLC to enumerate the complete
reachable state graph of
Storage.tla(the storage-interface module of the team's compositional cross-shard transactions spec), then computes path coverings for test-case generation. Concrete scale: 87,143 tests for a 2-key × 2-transaction finite model, generated + executed in ~40 minutes. The state-graph-export capability called for by issue #1073 in the 2020 retrospective is the enabling mechanism; this post is an order of magnitude larger a case than the 2020 OT experiment. Roadmap notes approximate-coverage strategies (randomized path sampling) as the expected scaling lever beyond complete enumeration.