
WiredTiger

Overview

WiredTiger is the B-tree-based storage engine MongoDB has used by default since MongoDB 3.2 (2015). Originally developed by the authors of Berkeley DB and acquired by MongoDB in 2014, it provides the on-disk format, the in-memory cache, MVCC (multi-version concurrency control) for document-level locking, per-collection compression, and journal / checkpoint durability for the MongoDB server.

WiredTiger also figures in MongoDB's VLDB 2025 paper "Design and Modular Verification of Distributed Transactions in MongoDB" (Schultz and Demirbas): its storage interface is modelled in TLA+ and validated alongside the cross-shard transaction protocol.

Key components for schema-design trade-offs

  • WiredTiger cache — the in-memory buffer pool holding uncompressed pages. Sized by default at the larger of 50 % of (RAM − 1 GB) or 256 MB; configurable via storage.wiredTiger.engineConfig.cacheSizeGB. This is the practical working-set-memory budget for a MongoDB deployment. Exceeding it thrashes page eviction and collapses throughput to disk latency.
  • Block compressor (storage.wiredTiger.collectionConfig.blockCompressor) — per-collection compression algorithm for data pages. Options: snappy (default — fast, moderate ratio), zstd (higher ratio, more CPU), zlib (highest ratio, most CPU), none. See concepts/document-storage-compression.
  • Prefix compression on indexes — default on. Index keys share common prefixes within a B-tree page; WiredTiger stores the suffix + a length reference. Cache accounting is of uncompressed bytes, so prefix compression reduces disk footprint but not cache-fit.
  • Journal — group-commit WAL with a default 100 ms flush cadence. Provides the durability side of MongoDB's w: majority, j: true write concerns.
  • Checkpoints — every 60 seconds (default), WiredTiger snapshots modified pages from cache to the on-disk tree root. Crash recovery replays the journal from the last checkpoint.
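
The knobs above map onto mongod.conf keys under storage.wiredTiger. A minimal sketch (values are illustrative, not recommendations; option availability varies by MongoDB version):

```yaml
# mongod.conf fragment — illustrative values only
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1.5        # overrides the default 50%-of-(RAM - 1 GB) formula
    collectionConfig:
      blockCompressor: zstd   # snappy (default) | zstd | zlib | none
    indexConfig:
      prefixCompression: true # default on; prefix-sharing within index pages
```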

Operational numbers from MongoDB case study

The load-test rig in Part 3 of MongoDB's "The Cost of Not Knowing MongoDB" series makes WiredTiger's sizing assumptions concrete:

  • 4 GB RAM on the test machine.
  • 1.5 GB WiredTiger cache allocation (per the default formula: max(0.5 × (4 − 1), 0.25) GB = 1.5 GB).
  • appV6R0's 3.13 GB _id index exceeded this cache budget and became the load-test-dominant bottleneck — "the limiting factor in this case is memory/cache rather than document size."
  • appV6R1's quarter-bucketing dropped the index to 1.22 GB, comfortably under the cache ceiling; throughput recovered.
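
The sizing arithmetic above fits in a few lines (the helper name is mine, not MongoDB's):

```python
def default_cache_gb(ram_gb: float) -> float:
    # WiredTiger default: larger of 50% of (RAM - 1 GB) or 256 MB (0.25 GB).
    return max(0.5 * (ram_gb - 1), 0.25)

cache = default_cache_gb(4)  # test machine: 4 GB RAM -> 1.5 GB cache
assert cache == 1.5

# Does the _id index fit in cache?
print(3.13 <= cache)  # appV6R0: False — 3.13 GB index exceeds cache, thrashes eviction
print(1.22 <= cache)  # appV6R1: True  — quarter-bucketed index fits comfortably
```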

Compressed-vs-uncompressed examples from the same article:

Revision   Data (uncompressed)   Storage (compressed)   Ratio
appV5R3    11.96 GB              3.24 GB                3.7×
appV6R1     8.19 GB              2.34 GB                3.5×
appV6R0    11.10 GB              3.33 GB                3.3×

All presumed snappy (WiredTiger's default); the article does not explicitly name the compressor.
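
The Ratio column is just uncompressed bytes over compressed bytes; a quick arithmetic check of the table:

```python
# (uncompressed GB, compressed GB) per revision, from the table above
sizes = {
    "appV5R3": (11.96, 3.24),
    "appV6R1": (8.19, 2.34),
    "appV6R0": (11.10, 3.33),
}
ratios = {rev: round(data / disk, 1) for rev, (data, disk) in sizes.items()}
print(ratios)  # {'appV5R3': 3.7, 'appV6R1': 3.5, 'appV6R0': 3.3}
```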

Adjacent surfaces

  • Cache eviction metrics in db.serverStatus().wiredTiger.cache: bytes currently in the cache, maximum bytes configured, eviction rates. Rising evictions under steady workload ⇒ working set exceeds cache.
  • Checkpoint metrics in serverStatus.wiredTiger.checkpoint: checkpoint duration + frequency. Slow checkpoints typically signal write saturation.
  • Block-manager metrics: bytes read / written by the underlying storage-layer block manager; maps directly to disk-throughput consumption.
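
A sketch of interpreting the two cache counters named above (stat names are real serverStatus keys; the numbers in the sample dict are hypothetical):

```python
# Hypothetical slice of db.serverStatus().wiredTiger.cache output
cache_stats = {
    "bytes currently in the cache": 1_450_000_000,
    "maximum bytes configured": 1_610_612_736,  # 1.5 GB cache
}

fill = (cache_stats["bytes currently in the cache"]
        / cache_stats["maximum bytes configured"])
# WiredTiger's eviction threads aim to keep the cache below ~80% full by default;
# sustained fill near or above that level means the working set is outgrowing cache.
print(f"cache fill: {fill:.0%}")  # cache fill: 90%
```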
