

Two-level distributed lock

Definition

A two-level distributed lock is a lock hierarchy in which contending workers must acquire a local lock first, then a shared / global lock, so that the number of contenders at the shared-lock level is bounded by the number of local-lock winners rather than by the total contender count.

The shape:

N concurrent requests for key K
         │
         ▼
┌─────────────────┐
│ Per-node lock K │  ← funnel: N requests → 1 winner per node
└─────────────────┘
         │
         ▼
┌───────────────────┐
│ Per-region lock K │  ← serialise: M nodes → 1 global winner
└───────────────────┘
         │
         ▼
    do the work

Without the first level, all N requests contend directly for the shared lock, and the lock itself becomes a thundering-herd bottleneck on popular keys. With the first level, only M requests reach the shared lock (where M = number of nodes), regardless of N (total requests). For typical CDN scale, M is tens-to-hundreds while N can be thousands-to-millions.
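
To make the funnel concrete, a back-of-envelope sketch (the request and node counts below are illustrative assumptions, not figures from the post):

```javascript
// Illustrative funnel arithmetic (assumed numbers): without level 1 the
// shared lock sees every request; with it, at most one contender per node.
const requestsPerNode = 1000; // assumed per-node concurrency for key K
const nodesPerRegion = 50;    // assumed region size (M)

const withoutLevel1 = requestsPerNode * nodesPerRegion; // N contenders
const withLevel1 = nodesPerRegion;                      // M contenders

console.log(`shared-lock contenders: ${withoutLevel1} -> ${withLevel1}`);
```

However large N grows, the shared lock's contender count stays pinned at M.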

Why the first level is load-bearing

The core intuition is given directly in the Vercel post:

"Without the node-level grouping, hundreds of concurrent requests could all compete for the regional lock simultaneously. This would create a thundering herd problem where the lock coordination itself becomes a bottleneck." (Source: sources/2026-04-21-vercel-preventing-the-stampede-request-collapsing-in-the-vercel-cdn)

That is, the fix for a cache stampede (see concepts/cache-stampede) would itself become a lock-acquisition stampede if the lock had only one level. The local level funnels contention down to O(nodes), which is bounded by cluster topology rather than traffic, so the shared lock sees a predictable and small contender count even under arbitrary request load.

Formal statement:

  • Level 1 (local): waiters per lock = O(concurrent requests on this node for this key).
  • Level 2 (shared): waiters per lock = O(nodes in region with concurrent demand for this key).
  • Total upstream work: O(1) invocations per region per miss.
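
A minimal sketch of the level-1 funnel, assuming an in-process keyed mutex built on promise chaining (the `KeyedLock` name and shape are illustrative, not Vercel's API):

```javascript
// Level-1 sketch: one in-process lock per cache key. Each key maps to a
// promise chain; withLock() queues behind earlier holders of the same key,
// runs fn exclusively, then releases. Different keys never contend.
class KeyedLock {
  constructor() {
    this.tails = new Map(); // cacheKey -> promise that resolves on release
  }
  async withLock(key, fn) {
    const prev = this.tails.get(key) ?? Promise.resolve();
    let release;
    const turn = new Promise((resolve) => { release = resolve; });
    this.tails.set(key, prev.then(() => turn));
    await prev;          // wait for earlier same-key holders to release
    try {
      return await fn(); // critical section: at most one fn per key
    } finally {
      release();
    }
  }
}
```

In the two-level scheme this sits in front of the regional lock: only the per-node `withLock` winner goes on to contend for the shared lock.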

Instance vs partition locking

The two-level idea is distinct from partitioned locking (a lock sharded by key hash). Both reduce contention, but:

  • Partitioned locking reduces contention across keys: different keys hash to different locks, so unrelated workloads don't contend. It does nothing for same-key contention.
  • Two-level locking reduces contention for the same key: same-key contenders funnel through the local level so the shared level sees O(nodes) contenders instead of O(requests).

The two are complementary: you partition the global lock by key hash to get cross-key parallelism, and you add a local level so same-key same-node contention doesn't spill directly to the global layer.
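
For contrast, a sketch of the partitioned side (the stripe count and hash function are illustrative assumptions): the key is hashed onto a fixed set of lock stripes, which parallelises across keys but does nothing for a single hot key.

```javascript
// Partitioned-lock sketch: 64 lock stripes, key hashed to a stripe index.
// Unrelated keys mostly land on different stripes (cross-key parallelism);
// every request for the SAME key still lands on the same stripe, so
// same-key contention is unchanged -- that is what the two-level
// hierarchy addresses.
const STRIPE_COUNT = 64; // assumed; a tuning knob, not a value from the post

function stripeFor(cacheKey) {
  let h = 0;
  for (const ch of cacheKey) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // simple 32-bit rolling hash
  }
  return h % STRIPE_COUNT; // index of the lock guarding this key
}
```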

Production shape: Vercel CDN

(sources/2026-04-21-vercel-preventing-the-stampede-request-collapsing-in-the-vercel-cdn)

  • Level 1 — node lock. In-memory, per-server-instance lock keyed by cache key. Multiple workers inside one node serialise against the node lock; only one worker per node can proceed.
  • Level 2 — regional lock. Distributed lock shared across all nodes in the region, keyed by cache key. One node-winner per node contends for the regional lock; only one node wins regionally and proceeds to invoke the upstream function.
  • Scope boundary. One invocation per region per cache miss. Global consistency across regions is not enforced: a truly global cold miss is accepted to cost up to one upstream invocation per region.

The code sketch in the post:

function createDistributedLock(cacheKey) {
  const nodeLock     = createNodeLock(cacheKey,     { timeout: 3000 });
  const regionalLock = createRegionalLock(cacheKey, { timeout: 3000 });
  return combineLocks([nodeLock, regionalLock]);
}

Note both levels get explicit timeouts — a lost regional lock doesn't block the whole region; a lost node lock doesn't block all node workers.

Design parameters

  • Fan-in ratio. L1 fan-in (requests per node) can be thousands; L2 fan-in (nodes per region) is typically tens-to-hundreds. The total collapse factor collapsing buys is the product of the two fan-ins.
  • Scope choice. Why not three levels (worker → node → region → global)? Each additional level adds coordination cost and a new failure mode. Two is the sweet spot for most CDN architectures where regions are the natural blast-radius boundary.
  • Lock implementations. L1 is cheap (in-process mutex or in-memory hash); L2 must be distributed and fault-tolerant. Vercel's post doesn't name the L2 substrate, but the interface (createRegionalLock, { timeout: 3000 }) is generic.
  • Timeout policy. Both levels time out — if L2 can't be acquired within the window, the waiter abandons and invokes directly. See concepts/lock-timeout-hedging.
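
The timeout policy can be sketched as follows, assuming a promise-returning `acquireLock` that resolves with a release function (the helper names here are illustrative, not the post's API):

```javascript
// Timeout-hedging sketch: race lock acquisition against a timer. If the
// lock is lost or slow, the waiter abandons it and invokes upstream
// directly rather than wedging behind a dead holder.
function withTimeout(promise, ms) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("lock timeout")), ms);
  });
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

async function collapsedFetch(acquireLock, invokeUpstream, timeoutMs = 3000) {
  let release;
  try {
    release = await withTimeout(acquireLock(), timeoutMs);
  } catch {
    return invokeUpstream(); // hedge: give up on coordination, go direct
  }
  try {
    return await invokeUpstream(); // lock held: the one collapsed call
  } finally {
    release();
  }
}
```

The trade-off is duplicate upstream work in the timeout case, accepted in exchange for bounding how long any waiter can be blocked by a failed lock holder.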

Alternatives considered

  • One-level (regional only). Simplest. N requests per region all contend on one lock. The lock protocol itself becomes a thundering-herd bottleneck on popular keys.
  • One-level (node only). Cheap. But doesn't prevent cross-node duplicate invocations — N nodes each invoke on the same key simultaneously.
  • Global single lock. One invocation worldwide per miss. But requires cross-region consensus (slow), and a regional lock failure becomes a global outage.
  • Fanout tree (3+ levels). Worker → node → zone → region → global. Diminishing returns on contention reduction, increasing cost on lock coordination. Rarely seen.
