CONCEPT

Cache stampede

Definition

A cache stampede (also called "cache-miss stampede" or "dogpile") is the specific thundering herd shape that occurs at a shared cache boundary when a popular cached entry goes from present to absent: every concurrent request sees the empty cache, each independently triggers an upstream regeneration, and the origin takes the full fan-in of work that the cache was supposed to absorb.

The Vercel post gives the canonical narrative (sources/2026-04-21-vercel-preventing-the-stampede-request-collapsing-in-the-vercel-cdn):

"Picture a page that just recently expired, or a new route getting hit for the first time. Multiple users request it simultaneously. Each request sees an empty cache and triggers a function invocation. […] Without coordination, each of those misses invokes the function independently. For a popular route, this can mean dozens of simultaneous invocations, all regenerating the same page. This wastes compute and hammers your backend."

Trigger conditions

A cache stampede requires three conditions to align:

  1. A shared cache entry. N clients agree on a cache key and all look it up.
  2. A miss transition. The entry is absent at lookup time: TTL expiry, explicit invalidation, cache eviction, a restart/deploy wiping the cache, or a first-time fill of a never-populated key.
  3. Concurrent demand. Requests arrive within a window shorter than the regeneration latency — i.e. the race to regenerate is genuinely concurrent.

When all three hold, the number of simultaneous upstream invocations scales with request concurrency, not with unique keys. A single popular key at 1,000 req/s with a 200 ms regeneration latency generates ~200 simultaneous invocations on every expiry.
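The scaling claim above is a one-line calculation. A minimal sketch, assuming steady arrivals (the function name is illustrative, not from the source):

```python
# Stampede size on a single hot key: every request that arrives while
# regeneration is in flight becomes another concurrent regeneration.
def stampede_size(req_per_sec: float, regen_latency_sec: float) -> int:
    """Approximate simultaneous upstream invocations at one miss transition."""
    return round(req_per_sec * regen_latency_sec)

print(stampede_size(1_000, 0.200))  # the worked example from the text: 200
```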

Variants (by miss transition)

  • TTL expiry stampede. Classic shape. An ISR page with revalidate: 60 expires; all concurrent visitors miss and regenerate. Vercel's request collapsing addresses this.
  • Deploy/restart stampede. The in-memory cache tier is wiped (deploy, crash, flush); every returning client triggers a cold fetch. Figma's LiveGraph pre-100x is the canonical instance (sources/2026-04-21-figma-keeping-it-100x-with-real-time-data-at-scale).
  • New-route stampede. A route goes live or trends for the first time; many concurrent first-requesters miss together. This is the "dozens of simultaneous invocations" case in the Vercel post on a never-populated key.
  • Invalidation stampede. Explicit invalidation of a popular key (e.g. a feature-flag config change broadcast) simultaneously forces all clients to re-fetch. See concepts/cache-ttl-staleness-dilemma for why short-TTL caches regularly tip into this mode.

Why it's worse than generic thundering herd

Cache stampedes have two amplification properties that a generic thundering herd does not:

  • Amplification by popularity. The keys most worth caching (hottest keys) are also the ones whose stampedes are worst — miss bursts are exactly where the cache was buying the most.
  • Amplification by cache tier count. A multi-tier cache (L1 in-memory → L2 regional → L3 origin) can have a stampede at any tier's miss boundary. Collapsing at just one tier leaves the others exposed.

Mitigations

  • Request collapsing (singleflight / dogpile prevention). One invocation per {node, region, global} scope; others wait. The direct fix, and Vercel's CDN-altitude answer.
  • Double-checked locking. Ensures the collapsing protocol is correct (waiters can skip their own invocation if the cache was populated during the lock wait).
  • Probabilistic early expiration. Before TTL expires, a small % of requests pre-emptively refresh the cache. Staggers expiry across requests rather than concentrating it at a single instant.
  • Stale-while-revalidate (SWR). Serve stale cached content to all callers while one background invocation refreshes. Avoids the miss-window entirely at the cost of freshness. Vercel's 90M collapsed/day on background revalidations is the SWR path; 3M on cold miss is the collapsing path — they solve adjacent problems.
  • Jittered TTL. Randomise TTL per key or per client so large populations don't expire in lock-step. See the cron-alignment case of concepts/thundering-herd.
  • Warm-up at the miss source. Pre-populate the cache before a known miss-causing event (deploy, cold region).

Canonical production instance

Vercel CDN (April 2026) solves cache stampedes on ISR routes via per-region request collapsing: 3M+/day on cache miss + 90M+/day on background revalidation. "Only one invocation per region runs, and every other request waits briefly to receive that same response once it is cached." (sources/2026-04-21-vercel-preventing-the-stampede-request-collapsing-in-the-vercel-cdn)

Seen in
