# WiredTiger
## Overview
WiredTiger is the B-tree-based storage engine MongoDB has used by default since MongoDB 3.2 (2015). Originally developed by the authors of Berkeley DB and acquired by MongoDB in 2014, it provides the on-disk format, the in-memory cache, MVCC (multi-version concurrency control) for document-level locking, per-collection compression, and journal / checkpoint durability for the MongoDB server.
WiredTiger is also the subject of model-based verification: MongoDB's VLDB 2025 paper "Design and Modular Verification of Distributed Transactions in MongoDB" (Schultz and Demirbas) models the storage interface in TLA+ and validates it alongside the cross-shard transaction protocol.
## Key components for schema-design trade-offs
- WiredTiger cache — the in-memory buffer pool holding uncompressed pages. Sized at "the larger of 50 % of (RAM − 1 GB) or 256 MB" by default; configurable via `storage.wiredTiger.engineConfig.cacheSizeGB`. This is the practical working-set memory budget for a MongoDB deployment: exceeding it thrashes page eviction and collapses throughput onto disk latency.
- Block compressor (`storage.wiredTiger.collectionConfig.blockCompressor`) — per-collection compression algorithm for data pages. Options: `snappy` (default — fast, moderate ratio), `zstd` (higher ratio, more CPU), `zlib` (highest ratio, most CPU), `none`. See concepts/document-storage-compression.
- Prefix compression on indexes — on by default. Index keys share common prefixes within a B-tree page; WiredTiger stores the suffix plus a length reference. Cache accounting is in uncompressed bytes, so prefix compression reduces disk footprint but not cache fit.
- Journal — group-commit write-ahead log with a default 100 ms flush cadence. Provides the durability side of MongoDB's `w: majority` and `j: true` write concerns.
- Checkpoints — every 60 seconds by default, WiredTiger snapshots modified pages from cache to a new consistent on-disk tree root. Crash recovery replays the journal from the last checkpoint.
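The default cache-sizing rule above can be sketched as a small helper. The function name is my own; the constants (50 % of RAM minus 1 GB, with a 256 MB floor) come straight from the rule quoted above:

```python
def default_wiredtiger_cache_gb(ram_gb: float) -> float:
    """Default WiredTiger cache size in GB:
    the larger of 50% of (RAM - 1 GB) or 256 MB (0.25 GB)."""
    return max(0.5 * (ram_gb - 1), 0.25)

print(default_wiredtiger_cache_gb(4))   # 4 GB test machine -> 1.5
print(default_wiredtiger_cache_gb(1))   # floor kicks in -> 0.25
```

On very small machines the 256 MB floor dominates; from roughly 1.5 GB of RAM upward the 50 % term wins.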
## Operational numbers from MongoDB case study
The load-test rig from MongoDB's "Cost of Not Knowing MongoDB" Part 3 makes WiredTiger's sizing assumptions concrete:
- 4 GB RAM on the test machine.
- 1.5 GB WiredTiger cache allocation (per the default formula: max(0.5 × (4 − 1), 0.25) = 1.5 GB).
- appV6R0's 3.13 GB `_id` index exceeded this cache budget and became the dominant load-test bottleneck — "the limiting factor in this case is memory/cache rather than document size."
- appV6R1's quarter-bucketing dropped the index to 1.22 GB, comfortably under the cache ceiling; throughput recovered.
Compressed-vs-uncompressed examples from the same article:
| Revision | Data (uncompressed) | Storage (compressed) | Ratio |
|---|---|---|---|
| appV5R3 | 11.96 GB | 3.24 GB | 3.7× |
| appV6R1 | 8.19 GB | 2.34 GB | 3.5× |
| appV6R0 | 11.10 GB | 3.33 GB | 3.3× |
All presumed snappy (WiredTiger's default); the article does not explicitly name the compressor.
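A quick sanity check of the table's ratios (revision names and sizes are from the article; the dict layout is my own):

```python
# (uncompressed data GB, compressed storage GB) per revision, from the table
revisions = {
    "appV5R3": (11.96, 3.24),
    "appV6R1": (8.19, 2.34),
    "appV6R0": (11.10, 3.33),
}

for name, (data_gb, storage_gb) in revisions.items():
    # reproduces the table's 3.7x / 3.5x / 3.3x ratios
    print(f"{name}: {data_gb / storage_gb:.1f}x")
```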
## Adjacent surfaces
- Cache eviction metrics in `db.serverStatus().wiredTiger.cache`: "bytes currently in the cache", "maximum bytes configured", eviction rates. Rising evictions under a steady workload ⇒ working set exceeds cache.
- Checkpoint metrics in `serverStatus.wiredTiger.checkpoint`: checkpoint duration and frequency. Slow checkpoints typically signal write saturation.
- Block-manager metrics: bytes read/written by the underlying storage-layer block manager; maps directly to disk-throughput consumption.
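A minimal sketch of reading those cache counters. The nested dict mirrors the shape of `db.serverStatus()["wiredTiger"]["cache"]` using the field names listed above; the sample byte values are invented, and the idea of a fill ratio is an illustration, not a MongoDB-recommended threshold:

```python
# Sample shaped like db.serverStatus()["wiredTiger"]["cache"];
# the byte values here are invented for illustration.
server_status = {
    "wiredTiger": {
        "cache": {
            "bytes currently in the cache": 1_400_000_000,
            "maximum bytes configured": 1_610_612_736,  # ~1.5 GB
        }
    }
}

cache = server_status["wiredTiger"]["cache"]
fill = cache["bytes currently in the cache"] / cache["maximum bytes configured"]

# A sustained high fill combined with rising eviction counters suggests
# the working set no longer fits in the WiredTiger cache.
print(f"cache fill: {fill:.0%}")
```

With a real deployment the same dict would come from `pymongo`'s `db.command("serverStatus")`.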
## Seen in
- sources/2026-02-27-mongodb-towards-model-based-verification-of-a-key-value-storage-engine — canonical wiki reference for WiredTiger as the target of model-based conformance checking. MongoDB's VLDB 2025 work extracts the interface boundary between the cross-shard transaction protocol and WiredTiger as a standalone `Storage.tla` spec (concepts/compositional-specification), enumerates its complete reachable state graph via a modified TLC, computes path coverings, and emits one test case per path as a sequence of WiredTiger API calls (patterns/test-case-generation-from-spec). Concrete result: 87,143 tests for a 2-key × 2-transaction finite model, generated and executed against WiredTiger in ~40 minutes. Specs and generator are open-sourced at mongodb-labs/vldb25-dist-txns. Caveats: the current spec covers a subset of WiredTiger's API semantics; the finite model is intentionally tiny; full technical detail is in the VLDB paper.
- sources/2025-10-09-mongodb-cost-of-not-knowing-mongodb-part-3-appv6r0-to-appv6r4 — WiredTiger's 1.5 GB cache on a 4 GB-RAM test machine is the specific budget the appV6R0 `_id` index overflowed, driving the pivot to quarter-bucketing in appV6R1. WiredTiger's default snappy compression (implicit throughout the article) is what produces the ~3.3–3.7× compression ratios observed across the appV5RX/appV6RX family's data vs storage sizes.