Logless Reconfiguration

Definition

Logless reconfiguration is MongoDB's replica-set membership-change protocol that decouples configuration changes from the data replication log. Under the prior scheme, adding or removing a voting member produced a config entry that queued behind user writes in the oplog; a slow or blocked oplog would therefore delay membership changes. Under the logless scheme, configuration state is propagated on a separate channel with its own ordering and commit rules.
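To make the "separate channel" concrete, here is a minimal sketch in Python (the class and method names are hypothetical; the real implementation is C++). It assumes configs carry a (term, version) pair compared lexicographically, with replicas learning configs via heartbeats rather than the oplog:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Config:
    term: int           # election term that installed this config
    version: int        # bumped on every reconfig within a term
    members: frozenset  # voting member ids

def is_newer(a: Config, b: Config) -> bool:
    # Configs are totally ordered by (term, version); higher wins.
    return (a.term, a.version) > (b.term, b.version)

class Replica:
    """Learns configs via heartbeats/gossip, never via the oplog."""
    def __init__(self, config: Config):
        self.config = config

    def receive_config(self, incoming: Config) -> bool:
        # Adopt only strictly newer configs, independent of oplog
        # progress, so a slow or blocked oplog cannot delay a
        # membership change.
        if is_newer(incoming, self.config):
            self.config = incoming
            return True
        return False
```

A replica stuck applying a long backlog of user writes would still adopt `incoming` immediately, which is the decoupling described above.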

The idea is simple; the implementation is not. Without care, concurrent configuration changes fork the cluster, stale primaries keep committing writes, or new majorities lose the old majority's writes. MongoDB's stance: prove the protocol first by modelling it in TLA+, letting the TLC model checker explore "millions of interleavings", and distilling the resulting correct-behaviour envelope into a small set of invariants.

(Source: sources/2025-09-25-mongodb-carrying-complexity-delivering-agility; paper arXiv:2102.11960.)

The four distilled invariants

MongoDB's published distillation of the verified protocol:

  1. Terms block stale primaries. A primary in a lower term can no longer commit once any member has seen a higher term — stale leaders self-eject.
  2. Monotonic versions prevent forks. Config versions are monotonically increasing; two incompatible configs can't both become active.
  3. Majority votes stop minority splits. A config change requires a majority of the current voting members to adopt it before it's considered committed; minorities can't unilaterally reconfigure.
  4. The oplog-commit rule ensures durability carries forward. Committed writes under an old configuration remain committed under the new one — reconfiguration does not re-open the question of what was durable.

Taken together, the four invariants preserve linearizability of committed writes across membership changes without serialising the changes behind the data replication log.
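A hedged sketch of how a primary might gate a reconfig on these four invariants (Python; the function, its parameters, and the heartbeat-derived quorum bookkeeping are hypothetical illustrations, not MongoDB's actual API — configs are plain (term, version) tuples here):

```python
def is_majority(count: int, total: int) -> bool:
    return count > total // 2

def can_install_reconfig(primary_term, highest_seen_term,
                         current_id, proposed_id,   # (term, version) tuples
                         current_members, proposed_members,
                         members_holding_current_config,
                         members_with_last_committed_write):
    """Return True only if all four invariants permit activating the
    proposed config. All inputs are hypothetical heartbeat-derived state."""
    # 1. Terms block stale primaries: seeing a higher term means step down.
    if highest_seen_term > primary_term:
        return False
    # 2. Monotonic versions prevent forks: only strictly newer configs.
    if not proposed_id > current_id:
        return False
    # 3. A majority of the *current* voters must have adopted the current
    #    config before it may be superseded.
    if not is_majority(len(members_holding_current_config & current_members),
                       len(current_members)):
        return False
    # 4. Oplog-commit rule: writes committed under the old config must
    #    already sit on a majority of the *new* config's members, so they
    #    remain committed after the change.
    if not is_majority(len(members_with_last_committed_write & proposed_members),
                       len(proposed_members)):
        return False
    return True
```

Each early `return False` corresponds to one invariant; a production primary would additionally have to gather and persist this state safely, which is the engineering work the Limitations section notes.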

Why this matters architecturally

Membership-change events are rare individually but load-bearing at fleet scale:

  • Elasticity — auto-scaling an Atlas cluster (adding / removing replicas) is a membership change; if that change queues behind the write log, elasticity is throughput-coupled to write load.
  • Cross-cloud + cross-region expansion — running a single replica set across AWS/GCP/Azure is architected on top of the same primitive; adding a new-cloud member is a reconfiguration.
  • Automated repair / node replacement — a node dies, the controller wants to add a replacement immediately, not wait on the oplog.

Decoupling the two channels is what lets MongoDB offer those operations as fast, predictable primitives whose latency is independent of write load.

Canonical instance of "model-check then distil to invariants"

The post frames this as the repeatable recipe, not just a one-off:

"We build a mathematical model of the core logic stripped of distracting details like disk format or thread pools and ask a model checker to try every possible interleaving of events. The tool doesn't skip the 'unlikely' cases. It tries them all."

"We modeled the protocol in TLA+, explored millions of interleavings, and distilled the solution down to four invariants."

This is the recipe behind patterns/formal-methods-before-shipping: the artifact a team remembers is not the 300-line TLA+ spec but the four-line invariant set; the model checker's job is to validate that those invariants hold across every explored interleaving, not to serve as a runtime verifier.
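To make "try every possible interleaving" concrete, here is a toy explicit-state checker in Python — a sketch of what a tool like TLC does, not TLC itself. The model deliberately drops the monotonic-version rule: two rival configs both claim version 2, and breadth-first exploration finds an interleaving in which different nodes adopt different same-version configs, exactly the fork that invariant 2 forbids:

```python
from collections import deque

def check(init, next_states, invariant):
    """Tiny explicit-state model checker: BFS over every reachable
    state, returning the first one that breaks the invariant."""
    seen, frontier = {init}, deque([init])
    while frontier:
        s = frontier.popleft()
        if not invariant(s):
            return s                      # counterexample found
        for t in next_states(s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return None                           # invariant holds everywhere

# Toy model: two rival configs both claim version 2; nodes adopt any
# config with a strictly higher version. (Deliberately omits the
# uniqueness/monotonicity rule to expose the fork.)
VERSION = {"C1": 1, "C2a": 2, "C2b": 2}

def next_states(state):                   # state = per-node config tuple
    for i, cur in enumerate(state):
        for p in ("C2a", "C2b"):
            if VERSION[p] > VERSION[cur]:
                yield state[:i] + (p,) + state[i + 1:]

def no_fork(state):
    # No two nodes may hold different configs with the same version.
    return not any(a != b and VERSION[a] == VERSION[b]
                   for a in state for b in state)

counterexample = check(("C1", "C1", "C1"), next_states, no_fork)
```

With uniquely, monotonically assigned versions, `no_fork` would hold in every reachable state and `check` would return None; the checker's value is that it does not skip the "unlikely" orderings.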

The same team applied the same recipe to multi-shard transactions (Source: sources/2025-09-25-mongodb-carrying-complexity-delivering-agility; VLDB 2025 paper):

  • Modular formal specification of the multi-shard protocol in TLA+ verifying protocol correctness + snapshot isolation.
  • Automated model-based testing of the WiredTiger storage interface.
  • Analysis of permissiveness — how much concurrency is allowed within the isolation level (under-restrictive models miss bugs; over-restrictive models forbid legitimate interleavings).

Limitations / caveats

  • Four invariants ≠ all you need. They are the distillation of the correctness proof; the production implementation still needs concurrency-safe data structures, retry logic, network-timeout handling, config-state persistence, etc. Verification proves the protocol is sound; engineering proves the code follows the protocol.
  • Conformance between spec and implementation is a separate problem. MongoDB's 2020 conformance-checking experiment failed at trace-checking the Raft-like consensus protocol because of abstraction mismatch. The logless-reconfig invariants are useful as a design-time contract; keeping the C++ implementation in continuous agreement with them is ongoing research + engineering.
  • "Millions of interleavings" is not exhaustive at the code level. TLC explored the model's state space, not every possible C++ thread schedule. Deterministic simulation (named in the same post — fuzzing + fault injection + message reordering against real binaries) is the paired technique that closes that gap.
