
Vercel CDN

Definition

Vercel CDN is Vercel's globally distributed edge-cache and request-serving tier for every deployment. It sits between the public internet and Vercel's compute / storage tiers, handling cache lookup, request collapsing, static asset delivery, geographic routing, and on-miss dispatch to Vercel Functions / Edge Functions.

Conceptually adjacent to — but distinct from — systems/vercel-routing-service: the routing service decides which backend a request should hit (static asset, edge function, Vercel function, or 404); the CDN handles cache semantics, stampede prevention, and response delivery. In practice the two run together at the edge, but the architectural roles are separable.

Cache hierarchy

(sources/2026-04-21-vercel-preventing-the-stampede-request-collapsing-in-the-vercel-cdn)

The CDN exposes three cache tiers, queried hot-to-cold:

  1. Node in-memory cache. Per-server-instance, small, for frequently-requested content. Served immediately on hit.
  2. Regional CDN cache (Vercel cache). Shared across all nodes in a region; replicated from the ISR cache. One tier deeper than the node cache.
  3. ISR cache. Global source of truth co-located with the functions. Stores regeneration outputs. Replicates to each region's regional cache on a background schedule.

On a full miss at all three tiers, a function invocation (Edge Function or Vercel Function) regenerates the content and populates the cache.
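The hot-to-cold lookup above can be sketched as follows. The tier shapes, names, and `lookup` are illustrative assumptions (the real stores are not disclosed), and the background replication from the ISR cache is simplified to on-hit back-population:

```typescript
// Illustrative tier shapes; the real stores and replication are not disclosed.
type Tier = { name: string; store: Map<string, string> };

const nodeCache: Tier = { name: "node", store: new Map() };         // per server instance
const regionalCache: Tier = { name: "regional", store: new Map() }; // per region
const isrCache: Tier = { name: "isr", store: new Map() };           // global source of truth

// Tiers are queried hot-to-cold; a hit at a colder tier back-populates
// every hotter tier so the next request for the key terminates earlier.
async function lookup(
  key: string,
  regenerate: () => Promise<string>, // function invocation on full miss
): Promise<string> {
  const tiers = [nodeCache, regionalCache, isrCache];
  for (let i = 0; i < tiers.length; i++) {
    const hit = tiers[i].store.get(key);
    if (hit !== undefined) {
      for (let j = 0; j < i; j++) tiers[j].store.set(key, hit);
      return hit;
    }
  }
  // Full miss at all three tiers: regenerate and populate every tier.
  const fresh = await regenerate();
  for (const t of tiers) t.store.set(key, fresh);
  return fresh;
}
```

The ordering matters: the cheapest tier is consulted first, and each miss only pays for the next-colder probe.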

Request collapsing

The CDN implements per-region request collapsing as a default behaviour for every cacheable route. Collapsing prevents cache stampedes (concepts/cache-stampede): when many concurrent requests for the same uncached path arrive, the CDN ensures that only one function invocation per region runs to regenerate the page, and all other requests wait briefly to receive the cached result.

The mechanism:

  • Two-level lock. See concepts/two-level-distributed-lock. Each CDN node maintains an in-memory lock per cache key; the region maintains a distributed lock across all nodes. Requests must acquire both to regenerate. The node lock funnels per-node contention to ≤1 per node; the regional lock serialises across nodes.
  • Double-checked locking. See concepts/double-checked-locking. The protocol checks the cache both before acquiring the lock (fast path) and after (to skip redundant regeneration when another waiter populated it first).
  • Lock timeouts. See concepts/lock-timeout-hedging. Both lock levels have a 3,000 ms timeout; waiters that can't acquire within the window abandon waiting and invoke independently, trading extra work for bounded tail latency.

The cache write happens asynchronously after the function returns (so TTFB is not blocked on the cache set), and the lock is released as soon as the cache is populated (so waiters can proceed before the response has finished streaming to the original requester).
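The protocol above can be sketched as follows. For brevity this collapses both lock levels into a single in-process lock (the regional lock's substrate is undisclosed); `serve`, `nodeLocks`, and the invocation counter are illustrative names, not Vercel's implementation:

```typescript
import { setTimeout as sleep } from "node:timers/promises";

const cache = new Map<string, string>();
const nodeLocks = new Map<string, Promise<void>>(); // one in-flight lock per cache key
const LOCK_TIMEOUT_MS = 3000; // both lock levels use a 3,000 ms timeout

let invocations = 0; // counts regenerations, to show collapsing at work

async function serve(key: string, regenerate: () => Promise<string>): Promise<string> {
  // Fast path of double-checked locking: consult the cache before any lock.
  const hit = cache.get(key);
  if (hit !== undefined) return hit;

  const held = nodeLocks.get(key);
  if (held !== undefined) {
    // Someone else is regenerating: wait for release, bounded by the timeout.
    const timedOut = await Promise.race([
      held.then(() => false),
      sleep(LOCK_TIMEOUT_MS, true, { ref: false }), // resolves to true on timeout
    ]);
    if (!timedOut) {
      // Second check: the lock holder populated the cache before releasing.
      const populated = cache.get(key);
      if (populated !== undefined) return populated;
    }
    // Timed out (or no value appeared): abandon waiting and invoke
    // independently, trading duplicate work for bounded tail latency.
    invocations++;
    return regenerate();
  }

  // Acquire the lock, regenerate, populate the cache, then release.
  let release!: () => void;
  nodeLocks.set(key, new Promise<void>((resolve) => (release = resolve)));
  try {
    invocations++;
    const fresh = await regenerate();
    cache.set(key, fresh); // populate *before* releasing so waiters find it
    return fresh;
  } finally {
    release();
    nodeLocks.delete(key);
  }
}
```

Waiters race the lock's release against the timeout, which is what bounds tail latency: a stuck regeneration can delay a waiter by at most `LOCK_TIMEOUT_MS` before it falls back to its own invocation.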

Cacheability: framework-inferred

Not every route is collapsible — e.g. dynamic API routes that return user-specific data, or routes with random content. The CDN distinguishes these via framework-inferred cache policy:

"When you deploy your app, Vercel analyses your routes and understands which ones use ISR, static generation, or dynamic rendering. This metadata gets distributed to every CDN region. When a request arrives, the CDN already knows whether that specific route can be safely cached and collapsed."

Three distinguished cacheability classes:

Route type                              Cacheable?  Collapsible?
ISR page (same content per user)        Yes         Yes
SSG page (build-time static)            Yes         n/a (no regeneration)
Dynamic API route (per-user data)       No          No
Page with random content / timestamps   No          No

No user configuration; the Next.js build classifies each route.
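A sketch of how the CDN might consult that build-output metadata per request; the manifest shape, route names, and `isCollapsible` are assumptions for illustration, not the actual build-output format:

```typescript
// Hypothetical build-output metadata, distributed to every CDN region.
type RenderMode = "isr" | "ssg" | "dynamic";

interface RouteMeta {
  mode: RenderMode;
  randomContent?: boolean; // e.g. timestamps or per-request randomness
}

const routeManifest: Record<string, RouteMeta> = {
  "/blog/[slug]": { mode: "isr" },     // same content per user
  "/about": { mode: "ssg" },           // build-time static
  "/api/me": { mode: "dynamic" },      // per-user data
  "/lucky-number": { mode: "dynamic", randomContent: true },
};

// Only routes that regenerate *shared* content are collapsible: ISR pages
// qualify; SSG pages never regenerate, so collapsing is moot; dynamic and
// random-content routes must be neither cached nor collapsed.
function isCollapsible(route: string): boolean {
  const meta = routeManifest[route];
  return meta !== undefined && meta.mode === "isr" && !meta.randomContent;
}
```

Because the classification is decided at build time and shipped to every region, the per-request check is a plain metadata lookup rather than an origin round-trip.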

Production numbers

(April 2026, as disclosed by the request-collapsing post.)

  • 3M+ requests/day collapsed on cache miss.
  • 90M+ requests/day collapsed on background revalidation (~30× the on-miss volume).
  • ~93M total/day collapsed.
  • Observed burst: 30 rps collapsed → 120 rps collapsed in a short window (4× spike, showing how quickly collapse benefit scales).
  • Adoption: 100% of Vercel projects using ISR (zero-config default).

The ~30:1 ratio of background-revalidation collapsing to cold-miss collapsing indicates the dominant cache lifecycle on Vercel: most traffic follows the cache-hit path, and collapsing at the background-revalidation boundary exceeds cold-miss collapsing by more than an order of magnitude.

Architectural role

Client ──► DNS ──► Edge POP ──► Routing Service ──► Vercel CDN ──┬─► Node in-memory cache (hit → return)
                                                                 ├─► Regional cache (hit → return, populate node)
                                                                 ├─► ISR cache (hit → return, populate regional)
                                                                 └─► Function invocation (miss → regenerate,
                                                                                          collapse via two-level lock)

The CDN is layered between the routing service (path-existence / which-backend decision) and the compute tier (function invocation). Every request to a deployed Vercel site passes through it.

Interaction with sibling systems

  • Next.js — supplies per-route cache semantics via build-output metadata. ISR revalidation windows, SSG/ISR/SSR classification, and dynamic-route detection all flow from Next.js build to the CDN.
  • systems/vercel-routing-service — runs in front of the CDN for path-existence and backend-selection decisions. The CDN handles everything after a real path is confirmed. Both consume deployment metadata at the edge (routing service: Bloom filter of paths; CDN: per-route cache semantics).
  • systems/vercel-functions / systems/vercel-edge-functions — the compute tier invoked on cache miss to regenerate. Collapsing directly bounds the invocation rate per region on hot keys.
  • systems/vercel-fluid-compute — pricing / compute model for the functions. Collapsing reduces active-CPU time by avoiding duplicate regenerations — direct cost reduction under Fluid Compute's active-CPU billing.

Caveats (undocumented aspects)

  • Lock substrate not named. The regional distributed lock's implementation (Redis, purpose-built, something else) is not disclosed.
  • Global cold-miss behaviour under-specified. One invocation per region means N regions → N invocations on a truly global cold miss. The post doesn't discuss any cross-region coalescing.
  • Negative-caching policy. Errors aren't cached, so repeated requests to a failing route re-invoke the function each time. No circuit breaker is mentioned.
  • Latency distributions undisclosed. "Briefly" (collapsed waiter wait time) is not pinned to a p99 number.
  • Cache-key derivation. The post shows getCacheKey(request) as a function but doesn't discuss what goes into the key (URL, host, vary headers, cookies?).
  • Interaction with stale-while-revalidate. The 90M/day background-revalidation-collapsed figure is large; the post implies SWR semantics apply (stale served while background refresh runs) but doesn't explicitly describe the state machine.
