Cache for Availability¶
Cache for availability is the framing that treats a cache not primarily as a latency or bandwidth optimisation, but as a way to decouple your uptime from a dependency's uptime. A cache hit during a downstream outage is an availability win, not a performance win.
The architectural shift: caches that you would otherwise provision for p50 / p99 latency are the same physical caches that give you outage survivability. Recognising the dual purpose lets you make explicit design choices — TTL length, cache population strategy, what to serve when stale — that a performance-only framing wouldn't surface.
Two shapes¶
Short-TTL cache in front of hot lookup¶
Absorbs dependency blips. Cache TTL is tuned so that a seconds-long outage of the dependency is invisible to requests that hit the cache. See patterns/cached-lookup-with-short-ttl for the canonical wiki pattern.
Trade-off: you accept "data up to TTL seconds stale" as the freshness floor, in exchange for tolerating TTL-seconds of downstream unavailability without impact.
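A minimal sketch of this shape, assuming a hypothetical `ShortTTLCache` wrapper and `lookup` callable (names are illustrative, not from the source): a fresh cache hit never touches the dependency at all, so any outage shorter than the TTL is invisible to requests that hit.

```python
import time

class ShortTTLCache:
    """Cache hot lookups for `ttl` seconds. A dependency outage shorter
    than the TTL is invisible to requests that hit the cache, because a
    fresh hit never calls the dependency at all."""

    def __init__(self, lookup, ttl=30.0):
        self.lookup = lookup      # the downstream call being insulated
        self.ttl = ttl
        self.entries = {}         # key -> (value, expires_at)

    def get(self, key):
        hit = self.entries.get(key)
        if hit and hit[1] > time.monotonic():
            return hit[0]         # fresh hit: dependency not touched
        value = self.lookup(key)  # miss: a dependency outage surfaces here
        self.entries[key] = (value, time.monotonic() + self.ttl)
        return value
```

The availability property falls directly out of the freshness trade-off: only misses can observe the outage, and the TTL bounds how often a hot key misses.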
CDN / edge cache in front of origin¶
Absorbs origin-tier outages. The edge holds previously-served responses; a total origin outage still lets cached URLs serve. See patterns/cdn-in-front-for-availability-fallback for the canonical wiki pattern.
Trade-off: the property only holds for previously-cached content and cacheable status codes (usually 200 only). Freshly-published content and error paths aren't covered.
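A hedged sketch of the edge-cache shape, assuming a hypothetical `EdgeCache` class and an `origin_fetch` callable returning `(status, body)` (all names illustrative): only 200s are stored, and on origin failure a previously-cached URL is served from the store, while a never-cached URL still fails.

```python
class EdgeCache:
    """Edge cache storing only 200 responses. A total origin outage
    still lets previously-cached URLs serve; URLs never served a 200
    get no fallback."""

    def __init__(self, origin_fetch):
        self.origin_fetch = origin_fetch   # returns (status, body)
        self.store = {}                    # url -> body of last 200

    def get(self, url):
        try:
            status, body = self.origin_fetch(url)
        except Exception:
            status, body = None, None      # origin unreachable
        if status == 200:
            self.store[url] = body         # only 200s are cacheable
            return 200, body
        if url in self.store:
            return 200, self.store[url]    # outage fallback: serve cached 200
        return status or 503, body         # never cached: failure propagates
```

This makes the trade-off concrete: the fallback set is exactly the set of URLs that previously returned a cacheable status.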
Design choices this framing surfaces¶
- Stale-while-revalidate / stale-if-error — explicit policy to serve stale on upstream error instead of propagating the error. HTTP cache-control directives express this directly.
- Cache negative responses? — 404s, 5xxs. Usually no, because a cached 404 masks content published after the miss was cached; but worth making the decision explicit.
- TTL tuning — shorter TTL = fresher data but shorter outage tolerance. Longer TTL = staler data but longer outage tolerance.
- Purge discipline — if the cache is for availability, you care that purges don't route through the same dependency you're insulating against.
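The first bullet's policy can be sketched as a freshness decision, following the stale-while-revalidate / stale-if-error model of RFC 5861 (parameter names here are illustrative, not a library API):

```python
def may_serve_stale(age, max_age, swr=0, sie=0, upstream_error=False):
    """Decide whether a cached response of the given age (seconds) may
    be served, per the RFC 5861 stale-while-revalidate / stale-if-error
    model. Sketch only; a real cache also handles revalidation."""
    if age <= max_age:
        return True                      # still fresh
    if upstream_error:
        return age <= max_age + sie      # serve stale instead of the error
    return age <= max_age + swr          # serve stale, revalidate in background
```

In HTTP itself the same policy is expressed directly in the header, e.g. `Cache-Control: max-age=30, stale-while-revalidate=60, stale-if-error=3600` (numbers illustrative).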
Seen in¶
- sources/2025-09-02-github-rearchitecting-github-pages — GitHub Pages's 2015 rewrite uses caches at two layers precisely for availability: (a) a 30 s nginx shared-memory cache on routing lookups absorbs MySQL blips (patterns/cached-lookup-with-short-ttl), and (b) Fastly in front of the whole origin caches every 200 so cached sites survive a total router outage (patterns/cdn-in-front-for-availability-fallback). The post frames both explicitly as availability moves, not just latency.
Related¶
- concepts/availability-dependency — the upstream failure mode being mitigated.
- patterns/cached-lookup-with-short-ttl — shape 1.
- patterns/cdn-in-front-for-availability-fallback — shape 2.
- systems/fastly — canonical CDN.