
CONCEPT Cited by 1 source

WiredTiger cache

Definition

The WiredTiger cache is the in-memory buffer pool that MongoDB's WiredTiger storage engine uses to hold uncompressed data and index pages. It is the primary determinant of whether queries hit memory or disk on a MongoDB server — effectively the working-set memory budget for the database.

Default sizing

MongoDB's documented formula for the default WiredTiger cache size is the larger of 50% of (RAM - 1 GB) or 256 MB. Concrete examples:

  • 4 GB RAM → 1.5 GB cache (this is the hardware envelope in the MongoDB Cost of Not Knowing Part 3 load test).
  • 8 GB RAM → 3.5 GB cache.
  • 16 GB RAM → 7.5 GB cache.
  • 64 GB RAM → 31.5 GB cache.
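The default-sizing formula can be sketched as a small helper (an illustration of the published formula, not MongoDB's internal code):

```python
def default_cache_gb(ram_gb: float) -> float:
    """Default WiredTiger cache size: the larger of
    50% of (RAM - 1 GB) or 256 MB."""
    return max(0.5 * (ram_gb - 1), 0.25)  # 256 MB = 0.25 GB

# Reproduces the concrete examples above:
for ram in (4, 8, 16, 64):
    print(ram, "GB RAM ->", default_cache_gb(ram), "GB cache")
```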

Only roughly half of RAM is allocated because the remainder is left for the OS filesystem cache, temporary data for aggregation-pipeline sorts and groups, connections, replication buffers, and the rest of the MongoDB server's working memory.

Tunable via storage.wiredTiger.engineConfig.cacheSizeGB in mongod.conf or the --wiredTigerCacheSizeGB command-line option.
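As a concrete mongod.conf fragment (a minimal sketch; the 1.5 GB value mirrors the 4 GB-RAM example above, and an explicit setting overrides the default formula):

```yaml
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1.5
```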

Why it's load-bearing for schema design

WiredTiger stores each index in its own B-tree; pages are loaded into the cache on access and evicted on an approximately least-recently-used basis. When a query needs an index page that isn't cache-resident, it incurs a disk read (SSD: microseconds; HDD: milliseconds). At sustained load, queries whose index footprint exceeds the cache degrade sharply: the cache is thrashed, eviction rates rise, and every query tends toward the disk-latency baseline.

This makes index size vs cache size the single most important capacity-planning number for MongoDB workloads: an index that fits in cache and one that doesn't differ by orders of magnitude in steady-state latency, even if everything else (data layout, query pattern, write volume) is identical.

Canonical instance of the trap

The MongoDB Cost of Not Knowing Part 3 case study surfaces this as the new bottleneck after documents were shrunk:

  • appV6R0 (monthly-bucketed, dynamic schema) achieved 125 B average documents vs. appV5R3's 385 B, a 67.5% shrink. But the approach also tripled the document count (95.3 M vs. 33.4 M), and the _id index grew to 3.13 GB, exceeding the 4 GB RAM machine's 1.5 GB WiredTiger cache allocation. The load test showed the expected disk-throughput win from smaller documents didn't materialize because index pages couldn't stay resident.
  • appV6R1 pivoted to quarterly-bucketing with the same dynamic-schema trick, dropping the _id index to 1.22 GB — back under the cache ceiling. Total per-event size fell from appV5R3's 28.1 B to 20.2 B; load-test throughput improved.
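A back-of-the-envelope check of the case-study numbers (the ~33 B/entry figure is inferred here from the reported 3.13 GB total over 95.3 M documents; it is not stated in the article):

```python
CACHE_BYTES = 1.5 * 1024**3  # 1.5 GB WiredTiger cache on the 4 GB-RAM rig

def id_index_bytes(doc_count: float, bytes_per_entry: float) -> float:
    """Rough _id index footprint: one index entry per document."""
    return doc_count * bytes_per_entry

# appV6R0: 95.3 M monthly-bucket documents at ~33 B/entry
v6r0_index = id_index_bytes(95.3e6, 33)     # ≈ 3.14 GB, overflows cache
print(v6r0_index > CACHE_BYTES)             # True

# appV6R1's reported 1.22 GB quarterly-bucket index fits under the ceiling
print(1.22e9 < CACHE_BYTES)                 # True
```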

The general lesson: shrinking documents doesn't automatically help once the binding constraint shifts to whether indexes fit in cache. The on-disk dimension you optimize has to be the one the current performance bottleneck actually depends on.

Cache-friendly schema heuristics

Drawn from the MongoDB case study and general practice:

  • Right-size bucket width to push index size under cache. Wider buckets = fewer documents = smaller _id index. Narrower buckets = smaller documents = better disk-throughput profile but larger index. Which wins depends on what's saturated.
  • Keep the secondary-index count in check. Each secondary index is its own B-tree competing for the same cache budget. appV4+ in the case study uses a single _id index for exactly this reason: packing all query predicates into the _id keeps total index pressure tied strictly to document count.
  • Benchmark on representative data volumes. A 1 GB dev collection fits entirely in cache on any modern machine and gives no signal about production behaviour at 100 GB. The case study's 500 M-event / 4 GB-RAM rig is deliberately sized so cache pressure is visible at reasonable test durations.
  • Monitor "bytes currently in the cache" against "maximum bytes configured" in the wiredTiger.cache section of serverStatus output, plus the eviction statistics. A rising eviction rate under a steady workload is the canonical symptom of exceeding cache.
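A minimal sketch of the fill-ratio check against serverStatus-shaped output (the sample document below is hypothetical; in practice the stats come from db.serverStatus()):

```python
def cache_fill_ratio(server_status: dict) -> float:
    """Fraction of the configured WiredTiger cache currently in use,
    using the serverStatus metric names quoted above."""
    cache = server_status["wiredTiger"]["cache"]
    return cache["bytes currently in the cache"] / cache["maximum bytes configured"]

# Hypothetical sample shaped like real serverStatus output:
status = {"wiredTiger": {"cache": {
    "bytes currently in the cache": 1_288_490_188,
    "maximum bytes configured": 1_610_612_736,  # 1.5 GB
}}}
print(round(cache_fill_ratio(status), 2))  # 0.8
```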

Seen in

  • sources/2025-10-09-mongodb-cost-of-not-knowing-mongodb-part-3-appv6r0-to-appv6r4 — cache ceiling at 1.5 GB on 4 GB RAM is the specific resource appV6R0's 3.13 GB _id index overflowed, driving the pivot to appV6R1's quarter-bucket schema. Named in-article: "this is near the 4GB of available memory on the machine running the database and exceeds the 1.5GB allocated by WiredTiger for cache … the limiting factor in this case is memory/cache rather than document size."