WiredTiger cache¶
Definition¶
The WiredTiger cache is the in-memory buffer pool that MongoDB's WiredTiger storage engine uses to hold uncompressed data and index pages. It is the primary determinant of whether a given query is served from memory or from disk on a MongoDB server — effectively the working-set memory budget for the database.
Default sizing¶
MongoDB's documentation gives the default WiredTiger cache size as the larger of 50% of (RAM − 1 GB) or 256 MB. Concrete examples:
- 4 GB RAM → 1.5 GB cache (this is the hardware envelope in the MongoDB Cost of Not Knowing Part 3 load test).
- 8 GB RAM → 3.5 GB cache.
- 16 GB RAM → 7.5 GB cache.
- 64 GB RAM → 31.5 GB cache.
Only roughly half of RAM is allocated because the other half is left for OS filesystem cache, aggregation pipeline sorts / group temporary data, connections, replication buffers, and the rest of the MongoDB server's working memory.
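The sizing rule can be sketched as a one-line function (a sketch of the documented formula; the real default is computed inside mongod):

```python
def default_wiredtiger_cache_gb(ram_gb: float) -> float:
    # The larger of 50% of (RAM - 1 GB) or 256 MB (0.25 GB).
    return max(0.5 * (ram_gb - 1), 0.25)

# Reproduces the examples above:
print(default_wiredtiger_cache_gb(4))   # 1.5
print(default_wiredtiger_cache_gb(64))  # 31.5
```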
Tunable via storage.wiredTiger.engineConfig.cacheSizeGB in mongod.conf or the --wiredTigerCacheSizeGB command-line option.
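For example, the following mongod.conf fragment caps the cache at 0.5 GB (the value is illustrative; cacheSizeGB accepts fractional gigabytes):

```yaml
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.5
```

The equivalent command-line invocation is mongod --wiredTigerCacheSizeGB 0.5.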
Why it's load-bearing for schema design¶
WiredTiger stores each index in its own B-tree; pages are loaded into the cache on access and evicted on an LRU basis. When a query needs an index page that isn't cache-resident, it incurs a disk read (SSD: microseconds; HDD: milliseconds). At sustained load, queries whose index footprint exceeds the cache degrade sharply — the cache is thrashed, eviction rates rise, and every query tends toward the disk-latency baseline.
This makes index size vs cache size the single most important capacity-planning number for MongoDB workloads: an index that fits in cache and one that doesn't differ by orders of magnitude in steady-state latency, even if everything else (data layout, query pattern, write volume) is identical.
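As a back-of-envelope planning check (plain arithmetic, not a MongoDB API), using the case study's numbers:

```python
GB = 1024 ** 3

def fits_in_cache(index_bytes: float, cache_bytes: float) -> bool:
    # Planning rule from the text: the hot index footprint must stay
    # cache-resident to avoid per-query disk reads under sustained load.
    return index_bytes <= cache_bytes

# appV6R0's 3.13 GB _id index overflows the 1.5 GB cache...
print(fits_in_cache(3.13 * GB, 1.5 * GB))  # False
# ...while appV6R1's 1.22 GB index fits.
print(fits_in_cache(1.22 * GB, 1.5 * GB))  # True
```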
Canonical instance of the trap¶
The MongoDB Cost of Not Knowing Part 3 case study surfaces this as the new bottleneck after documents were shrunk:
- appV6R0 (monthly-bucketed, dynamic schema) achieved 125 B average documents vs. appV5R3's 385 B, a 67.5% shrink. But the approach also tripled the document count (95.3 M vs 33.4 M), and the _id index grew to 3.13 GB, exceeding the 4 GB-RAM machine's 1.5 GB WiredTiger cache allocation. The load test showed the expected disk-throughput win from smaller documents didn't materialize because index pages couldn't stay resident.
- appV6R1 pivoted to quarterly bucketing with the same dynamic-schema trick, dropping the _id index to 1.22 GB, back under the cache ceiling. Total per-event size fell from appV5R3's 28.1 B to 20.2 B; load-test throughput improved.
The general lesson: shrinking documents doesn't automatically help if the new bottleneck is whether indexes fit in cache. The dimension you optimize on disk has to match the dimension that is actually saturated at runtime.
Cache-friendly schema heuristics¶
Drawn from the MongoDB case study and general practice:
- Right-size bucket width to push index size under cache. Wider buckets mean fewer documents and a smaller _id index; narrower buckets mean smaller documents and a better disk-throughput profile but a larger index. Which wins depends on what's saturated.
- Keep secondary-index count and cardinality in check. Each secondary index occupies its own cache budget. appV4+ in the case study uses a single _id index for exactly this reason: packing all query predicates into the _id keeps total index pressure tied strictly to document count.
- Benchmark on representative data volumes. A 1 GB dev collection fits entirely in cache on any modern machine and gives no signal about production behaviour at 100 GB. The case study's 500 M-event / 4 GB-RAM rig is deliberately sized so cache pressure is visible at reasonable test durations.
- Monitor the wiredTiger.cache counters "bytes currently in the cache" vs "maximum bytes configured", plus eviction statistics, in serverStatus output. A rising eviction rate under steady workload is the canonical symptom of exceeding cache.
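Those counters can be summarized from the serverStatus document (with PyMongo it comes from db.command("serverStatus")["wiredTiger"]["cache"]); a minimal sketch, with hypothetical sample numbers:

```python
def cache_pressure(wt_cache: dict) -> dict:
    # wt_cache is serverStatus()["wiredTiger"]["cache"]; the key names
    # below match real serverStatus fields.
    used = wt_cache["bytes currently in the cache"]
    cap = wt_cache["maximum bytes configured"]
    evicted = (wt_cache.get("unmodified pages evicted", 0)
               + wt_cache.get("modified pages evicted", 0))
    return {"fill_ratio": used / cap, "pages_evicted": evicted}

# Hypothetical snapshot (the numbers are made up for illustration):
sample = {
    "bytes currently in the cache": 1_288_490_189,  # ~1.2 GB resident
    "maximum bytes configured": 1_610_612_736,      # 1.5 GB cache
    "unmodified pages evicted": 41_000,
    "modified pages evicted": 9_000,
}
stats = cache_pressure(sample)
```

Sampling this periodically and watching the fill ratio plus the eviction-count delta between snapshots turns the "rising eviction rate" symptom into a concrete alert threshold.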
Seen in¶
- sources/2025-10-09-mongodb-cost-of-not-knowing-mongodb-part-3-appv6r0-to-appv6r4 — the 1.5 GB cache ceiling on 4 GB RAM is the specific resource that appV6R0's 3.13 GB _id index overflowed, driving the pivot to appV6R1's quarter-bucket schema. Named in-article: "this is near the 4GB of available memory on the machine running the database and exceeds the 1.5GB allocated by WiredTiger for cache … the limiting factor in this case is memory/cache rather than document size."