Cache for Availability¶
Cache for availability is the framing that treats a cache not primarily as a latency or bandwidth optimisation, but as a way to decouple your uptime from a dependency's uptime. A cache hit during a downstream outage is an availability win, not a performance win.
The architectural shift: caches that you would otherwise provision for p50 / p99 latency are the same physical caches that give you outage survivability. Recognising the dual purpose lets you make explicit design choices — TTL length, cache population strategy, what to serve when stale — that a performance-only framing wouldn't surface.
Two shapes¶
Short-TTL cache in front of hot lookup¶
Absorbs dependency blips. Cache TTL is tuned so that a seconds-long outage of the dependency is invisible to requests that hit the cache. See patterns/cached-lookup-with-short-ttl for the canonical wiki pattern.
Trade-off: you accept "data up to TTL seconds stale" as the freshness floor, in exchange for tolerating TTL-seconds of downstream unavailability without impact.
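A minimal sketch of this shape, assuming a hypothetical `ShortTTLCache` wrapper and `lookup` callable (names are illustrative, not from the source): a fresh cache hit never touches the dependency at all, so any outage shorter than the TTL is invisible to requests that hit.

```python
import time

class ShortTTLCache:
    """Cache hot lookups for `ttl` seconds. A dependency outage shorter
    than the TTL is invisible to requests that hit the cache, because a
    fresh hit never calls the dependency at all."""

    def __init__(self, lookup, ttl=30.0):
        self.lookup = lookup      # the downstream call being insulated
        self.ttl = ttl
        self.entries = {}         # key -> (value, expires_at)

    def get(self, key):
        hit = self.entries.get(key)
        if hit and hit[1] > time.monotonic():
            return hit[0]         # fresh hit: dependency not touched
        value = self.lookup(key)  # miss: a dependency outage surfaces here
        self.entries[key] = (value, time.monotonic() + self.ttl)
        return value
```

The availability property falls directly out of the freshness trade-off: only misses can observe the outage, and the TTL bounds how often a hot key misses.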
CDN / edge cache in front of origin¶
Absorbs origin-tier outages. The edge holds previously-served responses; a total origin outage still lets cached URLs serve. See patterns/cdn-in-front-for-availability-fallback for the canonical wiki pattern.
Trade-off: the property only holds for previously-cached content and cacheable status codes (usually 200 only). Freshly-published content and error paths aren't covered.
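A hedged sketch of the edge-cache shape, assuming a hypothetical `EdgeCache` class and an `origin_fetch` callable returning `(status, body)` (all names illustrative): only 200s are stored, and on origin failure a previously-cached URL is served from the store, while a never-cached URL still fails.

```python
class EdgeCache:
    """Edge cache storing only 200 responses. A total origin outage
    still lets previously-cached URLs serve; URLs never served a 200
    get no fallback."""

    def __init__(self, origin_fetch):
        self.origin_fetch = origin_fetch   # returns (status, body)
        self.store = {}                    # url -> body of last 200

    def get(self, url):
        try:
            status, body = self.origin_fetch(url)
        except Exception:
            status, body = None, None      # origin unreachable
        if status == 200:
            self.store[url] = body         # only 200s are cacheable
            return 200, body
        if url in self.store:
            return 200, self.store[url]    # outage fallback: serve cached 200
        return status or 503, body         # never cached: failure propagates
```

This makes the trade-off concrete: the fallback set is exactly the set of URLs that previously returned a cacheable status.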
Design choices this framing surfaces¶
- Stale-while-revalidate / stale-if-error — explicit policy to serve stale on upstream error instead of propagating the error. HTTP cache-control directives express this directly.
- Cache negative responses? — 404s, 5xxs. Usually no, because a cached 404 masks content published after the miss was cached; but worth making the decision explicit.
- TTL tuning — shorter TTL = fresher data but shorter outage tolerance. Longer TTL = staler data but longer outage tolerance.
- Purge discipline — if the cache is for availability, you care that purges don't route through the same dependency you're insulating against.
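The first bullet's policy can be sketched as a freshness decision, following the stale-while-revalidate / stale-if-error model of RFC 5861 (parameter names here are illustrative, not a library API):

```python
def may_serve_stale(age, max_age, swr=0, sie=0, upstream_error=False):
    """Decide whether a cached response of the given age (seconds) may
    be served, per the RFC 5861 stale-while-revalidate / stale-if-error
    model. Sketch only; a real cache also handles revalidation."""
    if age <= max_age:
        return True                      # still fresh
    if upstream_error:
        return age <= max_age + sie      # serve stale instead of the error
    return age <= max_age + swr          # serve stale, revalidate in background
```

In HTTP itself the same policy is expressed directly in the header, e.g. `Cache-Control: max-age=30, stale-while-revalidate=60, stale-if-error=3600` (numbers illustrative).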
Seen in¶
- sources/2025-09-02-github-rearchitecting-github-pages — GitHub Pages's 2015 rewrite uses caches at two layers precisely for availability: (a) a 30 s nginx shared-memory cache on routing lookups absorbs MySQL blips (patterns/cached-lookup-with-short-ttl), and (b) Fastly in front of the whole origin caches every 200 so cached sites survive a total router outage (patterns/cdn-in-front-for-availability-fallback). The post frames both explicitly as availability moves, not just latency.
Related¶
- concepts/availability-dependency — the upstream failure mode being mitigated.
- patterns/cached-lookup-with-short-ttl — shape 1.
- patterns/cdn-in-front-for-availability-fallback — shape 2.
- systems/fastly — canonical CDN.