Double-checked locking¶
Definition¶
Double-checked locking is a concurrency idiom that avoids duplicating expensive work when multiple threads, requests, or processes compete to do the same thing. The structure is:
- First check — read the shared state. If it already holds a valid value, return immediately. No lock taken, no expense paid.
- Acquire lock. Serialise with other potential workers.
- Second check — re-read the shared state under the lock. Another worker may have populated it while this one was waiting for the lock.
- Do the work — only if the second check still shows no value. Store the result, then release the lock.
The idiom is named for the two state checks around the lock. It is the standard answer to a tension: serialising workers is necessary for correctness, but skipping the work when a waiter wakes up to find it already done is necessary for throughput.
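The four steps can be sketched in-process with a promise-chain mutex standing in for the lock. All names here (`createMutex`, `getValue`, `compute`) are illustrative, not from the source:

```javascript
// Minimal in-process sketch of the four steps, using a promise-chain mutex.
function createMutex() {
  let tail = Promise.resolve();
  return {
    // Resolves to a release() function once all earlier holders are done.
    acquire() {
      let release;
      const turn = new Promise((resolve) => { release = resolve; });
      const acquired = tail.then(() => release);
      tail = turn;
      return acquired;
    },
  };
}

const mutex = createMutex();
let shared = null; // lazily-initialised shared state

async function getValue(compute) {
  if (shared !== null) return shared;     // first check: no lock, no cost
  const release = await mutex.acquire();  // serialise with other workers
  try {
    if (shared !== null) return shared;   // second check, under the lock
    shared = await compute();             // do the work exactly once
    return shared;
  } finally {
    release();                            // store happens before release
  }
}
```

With this shape, any number of concurrent callers arriving before the value exists produce exactly one invocation of `compute`; everyone after the first waiter returns from the second check.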
Why two checks?¶
The naive one-check form (check → lock → do work) is correct but redundant: every waiter entering the lock does the work, even if the first one already completed it. The improved two-check form (check → lock → check → work) skips the work when a prior waiter finished it during the lock wait.
Without the second check, a collapsing implementation still serialises callers but does not collapse work: each waiter invokes the function after acquiring the lock. Vercel frames this explicitly:
"Locking alone doesn't collapse requests. If every waiter invoked the function after getting the lock, work would still be duplicated." (Source: sources/2026-04-21-vercel-preventing-the-stampede-request-collapsing-in-the-vercel-cdn)
The second check is therefore the deduplication step — it's what makes collapsing actually collapse.
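For contrast, a one-check version (all names hypothetical) makes the duplication concrete: the lock serialises the callers, but every waiter that queued before the value existed redoes the work after acquiring it.

```javascript
// One-check form: correct and serialised, but not collapsing. Illustrative only.
function createMutex() {
  let tail = Promise.resolve();
  return {
    acquire() {
      let release;
      const turn = new Promise((resolve) => { release = resolve; });
      const acquired = tail.then(() => release);
      tail = turn;
      return acquired;
    },
  };
}

const mutex = createMutex();
let value = null;

async function getOneCheck(work) {
  if (value !== null) return value;   // only check: before the lock
  const release = await mutex.acquire();
  try {
    value = await work();             // no second check: every queued
    return value;                     // waiter redoes the work
  } finally {
    release();
  }
}
```

Three callers arriving before the value exists all invoke `work`; with the second check, only the first would.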
Pseudo-code¶
async function respond(request) {
  const cacheKey = getCacheKey(request);

  // First check — fast path on a cache hit, no lock taken
  const cached = await cache.get(cacheKey);
  if (cached) return cached;

  // Acquire a distributed lock keyed by cacheKey
  const lock = createDistributedLock(cacheKey);
  await lock.acquire();
  try {
    // Second check — someone else may have populated while we waited
    const repopulated = await cache.get(cacheKey);
    if (repopulated) return repopulated;

    // Do the work — only path that actually regenerates
    const response = await invokeFunction(request);

    // Populate the cache before the lock is released in `finally`,
    // so the next waiter's second check is guaranteed to see the value
    await cache.set(cacheKey, response);
    return response;
  } finally {
    lock.release();
  }
}
Source: adapted from Vercel's request-collapsing post (sources/2026-04-21-vercel-preventing-the-stampede-request-collapsing-in-the-vercel-cdn).
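The shape above can be exercised with in-memory stand-ins. Everything below (the Map-backed cache, the promise-chain per-key lock, the `regenerate` parameter) is an assumed test harness, not Vercel's substrate:

```javascript
// In-memory stand-ins for the helpers (assumptions for illustration only).
const cache = new Map();       // cacheKey -> response
const lockChains = new Map();  // cacheKey -> tail of that key's lock queue

function createDistributedLock(key) {
  let release = () => {};
  return {
    acquire() {
      const tail = lockChains.get(key) ?? Promise.resolve();
      let turnRelease;
      const turn = new Promise((resolve) => { turnRelease = resolve; });
      lockChains.set(key, turn);
      return tail.then(() => { release = turnRelease; });
    },
    release() { release(); },
  };
}

async function respond(request, regenerate) {
  const cacheKey = request.key;
  const hit = cache.get(cacheKey);        // first check
  if (hit) return hit;
  const lock = createDistributedLock(cacheKey);
  await lock.acquire();
  try {
    const again = cache.get(cacheKey);    // second check, under the lock
    if (again) return again;
    const response = await regenerate(request);
    cache.set(cacheKey, response);        // populate before release
    return response;
  } finally {
    lock.release();
  }
}
```

Running several concurrent requests for the same key through `respond` yields one `regenerate` call; the rest are served by the second check.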
Historical context¶
The idiom dates to lazy-singleton initialisation in C++ and Java ("check-then-initialise-then-publish"). The classic Java form had a famous bug pre-JDK 5: without volatile, the store of the reference could be reordered such that another thread saw a non-null but partially-constructed object. The Java memory model fix (volatile plus publication guarantees) is the textbook concurrent-programming lesson; Doug Lea's JSR-133 work and "Java Concurrency in Practice" cover it in detail.
At the distributed altitude, the analogous hazard is cache write visibility: if the cache set isn't synchronously visible before lock release, a waiter could acquire the lock, re-check the cache, see a stale miss, and redundantly invoke. Vercel's implementation sidesteps this by holding the lock until the cache is populated ("the lock is released as soon as the cache is populated, so waiters can proceed quickly"): the cache write happens-before the lock release, so waiters always see the populated value.
At the distributed-cache altitude¶
Adapting DCL from shared memory to distributed caches requires three extra considerations:
- Lock substrate. The lock must be distributed (Redis, ZooKeeper, purpose-built). The Vercel post doesn't name the substrate but the shape is clear: per-key, per-region, timeout-bounded.
- Cache-write ordering vs lock release. The cache write must be visible to other lock acquirers before they re-check. Easiest: populate cache first, release lock second. Vercel does this (lock released "as soon as the cache is populated").
- Lock timeouts. A distributed lock can be lost (network partition, holder crash). DCL must tolerate the second check seeing an empty cache for legitimate reasons (prior holder crashed mid-invocation) and proceed to invoke; combined with lock timeout hedging, this degrades gracefully instead of piling up indefinitely.
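The post doesn't specify the lock substrate, so the timeout behaviour can only be sketched under assumptions. Here the hypothetical `acquireWithTimeout` stops waiting after `timeoutMs`, so a waiter stuck behind a crashed holder eventually proceeds and redoes the work itself:

```javascript
// Sketch of a timeout-bounded acquire (names and policy are assumptions,
// not from the source). Resolves to "acquired" if the lock frees up in
// time, or "timeout" if the current holder appears to be gone.
function acquireWithTimeout(lockPromise, timeoutMs) {
  const timedOut = new Promise((resolve) =>
    setTimeout(() => resolve("timeout"), timeoutMs));
  return Promise.race([lockPromise.then(() => "acquired"), timedOut]);
}
```

In the `respond` shape, a "timeout" outcome falls through to the invoke path; the second check still runs first, so if the crashed holder did manage to populate the cache, the waiter returns the cached value instead of re-invoking.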
Common mistakes¶
- Omitting the second check. The most common implementation bug. Yields a correct serialising lock but not a deduplicating one — every waiter still does the work.
- Caching errors. If the first invocation fails, the cache stays empty; the next lock acquirer's second check still shows empty; it retries. This is desirable for transient errors, but a permanent failure (e.g. a 4xx from a broken route) can cause endless retries. Explicit negative caching can help here — but note the Vercel post explicitly chooses not to cache errors.
- Reading from cache before all writes settle. If the cache is write-behind, the second check might race with a concurrent set. Strongly ordered caches or synchronous cache writes are required.
- Long-lived lock. Holding the lock across the full response send (not just the cache populate) delays waiter progress. Vercel explicitly doesn't do this: the response is sent without waiting for the cache set to complete, but the lock is released at cache-populated. Two different "as soon as possible" release points for two different latency goals.
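If one did add negative caching for permanent failures (which, again, the Vercel post explicitly chooses not to do), a sketch with a short error TTL might look like this; all names and TTL values here are assumptions:

```javascript
// Sketch of negative caching (NOT what the source does). Failures are
// cached briefly so a permanently broken route can't drive an unbounded
// retry loop; successes are cached with a longer TTL. Lock omitted for focus.
const entries = new Map(); // key -> { ok, value, expires }

function getCached(key, now = Date.now()) {
  const entry = entries.get(key);
  if (!entry || entry.expires <= now) return undefined;
  return entry;
}

async function generateOnce(key, work, { errorTtlMs = 1000, okTtlMs = 60000 } = {}) {
  const hit = getCached(key);
  if (hit) {
    if (hit.ok) return hit.value;
    throw hit.value; // replay the recent failure without re-invoking
  }
  try {
    const value = await work();
    entries.set(key, { ok: true, value, expires: Date.now() + okTtlMs });
    return value;
  } catch (err) {
    entries.set(key, { ok: false, value: err, expires: Date.now() + errorTtlMs });
    throw err;
  }
}
```

Within the error TTL, repeated callers see the cached failure instead of re-invoking the broken route; after it expires, one retry is allowed through.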
Canonical production instance¶
Vercel CDN request collapsing (April 2026) uses DCL at the per-region distributed-lock altitude, as the correctness layer on top of the two-level lock. The first check is the cache lookup before lock acquisition; the second check is after lock acquisition; cache population triggers lock release. (sources/2026-04-21-vercel-preventing-the-stampede-request-collapsing-in-the-vercel-cdn)
Seen in¶
- sources/2026-04-21-vercel-preventing-the-stampede-request-collapsing-in-the-vercel-cdn — canonical CDN-altitude application; the post explicitly names "the well-known double-checked locking pattern" and gives pseudo-code.
Related¶
- concepts/request-collapsing — the primary use case for DCL at the cache/CDN altitude.
- concepts/cache-stampede — the failure collapsing (with DCL) prevents.
- concepts/two-level-distributed-lock — the scalability structure DCL layers over.
- concepts/lock-timeout-hedging — the complementary failure-mode policy.