PATTERN Cited by 1 source
Cached Lookup with Short TTL
Shape
Cache the result of an expensive or availability-coupled lookup (DB query, RPC, DNS, auth token validation) in local in-process memory with a short TTL — seconds, not minutes. Every request first checks the cache; on hit, skip the downstream call entirely; on miss or TTL-expiry, do the lookup and repopulate.
The TTL is tuned so that:
- Hot lookups hit the cache the overwhelming majority of the time (= dominant bandwidth offload + latency reduction).
- Data freshness is acceptable — users tolerate "data up to TTL seconds stale".
- Downstream outages shorter than TTL are invisible to cache-hit traffic.
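A minimal in-process sketch of the shape, in Python for concreteness (the class and function names are illustrative, not from the source; the canonical instance below uses ngx_lua shared memory zones instead):

```python
import time

class TTLCache:
    """In-process cache: entries expire after ttl_seconds and are re-fetched on demand."""

    def __init__(self, ttl_seconds=30.0):
        self.ttl = ttl_seconds
        self._entries = {}  # key -> (value, expires_at)

    def get_or_load(self, key, loader):
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]                               # hit: skip the downstream call
        value = loader(key)                               # miss or expired: do the lookup...
        self._entries[key] = (value, now + self.ttl)      # ...and repopulate
        return value


# Illustrative usage: hostname -> fileserver routing lookup, stale by at most 30 s.
routing_cache = TTLCache(ttl_seconds=30.0)

def route(hostname, lookup_fileserver_in_db):
    return routing_cache.get_or_load(hostname, lookup_fileserver_in_db)
```

Note that in this naive variant every caller that misses at the same moment goes downstream; the stale-while-revalidate variant sketched under Design decisions avoids that.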
When to use
- Lookup is on a hot code path (per-request or near-per-request).
- Lookup target is a soft-freshness store — the latest value matters within seconds, not milliseconds.
- Downstream has latency variability or transient unavailability that you want to insulate callers from.
- Cache memory cost is bounded (key cardinality × value size fits in process memory).
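As a rough sizing check for that last point (illustrative numbers, not from the source):

```python
# Back-of-envelope memory bound: key cardinality x value size (assumed figures).
keys, bytes_per_entry = 1_000_000, 200            # e.g. hostname -> fileserver string
print(f"~{keys * bytes_per_entry / 2**20:.0f} MiB resident")   # ~191 MiB per process
```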
When NOT to use
- Strict freshness required (auth session revocation, inventory on a transactional path).
- Unbounded key cardinality — the cache evicts under load and you get pathological miss rates.
- Stale reads have security / correctness implications that the TTL-bounded window can't tolerate.
Design decisions
- TTL length — trade freshness vs. dependency-insulation window. 30 s is a common sweet spot for "absorb blips, stay fresh". Sub-second TTLs are performance caches, not availability caches. Multi-minute TTLs are usually a freshness mistake.
- Cache scope — per-process (each nginx worker has its own) vs. per-host (shared-memory zone across workers). Per-host gives better hit ratios; per-process is simpler.
- Miss behaviour — either block every miss on the downstream call (risking a thundering herd at TTL expiry), or serve the stale value and refresh asynchronously (stale-while-revalidate); see the sketch after this list.
- Negative caching — cache lookup misses / errors too? Usually with a much shorter TTL, to avoid "the dependency came back but we're still serving errors".
- Invalidation — short TTL is itself the invalidation mechanism. If you need faster, you need explicit purge — at which point re-evaluate whether this pattern fits.
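A hedged sketch of the serve-stale-plus-async-refresh miss behaviour combined with short-TTL negative caching, in Python with a background thread (all names and the 2 s negative TTL are illustrative assumptions, not from the source):

```python
import threading
import time

class StaleWhileRevalidateCache:
    """Serve a stale value past its TTL while one background refresh runs per key;
    remember failures ("negative" entries) for a much shorter window."""

    def __init__(self, ttl=30.0, negative_ttl=2.0):
        self.ttl = ttl
        self.negative_ttl = negative_ttl
        self._entries = {}        # key -> (value_or_exception, is_error, expires_at)
        self._refreshing = set()  # keys with an in-flight background refresh
        self._lock = threading.Lock()

    def get_or_load(self, key, loader):
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None:
            value, is_error, expires_at = entry
            if expires_at > now:
                if is_error:
                    raise value            # still inside the short negative-cache window
                return value               # fresh hit
            if not is_error:
                # Stale success value: serve it now, refresh asynchronously (single flight).
                if self._claim_refresh(key):
                    threading.Thread(target=self._refresh, args=(key, loader), daemon=True).start()
                return value
        # Cold miss, or an expired negative entry: load synchronously.
        return self._load(key, loader)

    def _claim_refresh(self, key):
        with self._lock:
            if key in self._refreshing:
                return False
            self._refreshing.add(key)
            return True

    def _refresh(self, key, loader):
        try:
            self._load(key, loader)
        except Exception:
            pass                           # failure is already recorded as a negative entry
        finally:
            with self._lock:
                self._refreshing.discard(key)

    def _load(self, key, loader):
        try:
            value = loader(key)
            self._entries[key] = (value, False, time.monotonic() + self.ttl)
            return value
        except Exception as err:
            # Negative caching: a flapping dependency is not hammered on every request,
            # but recovery is quick once it comes back.
            self._entries[key] = (err, True, time.monotonic() + self.negative_ttl)
            raise
```

The single-flight set means at most one request per key pays the refresh cost at TTL expiry; every other request keeps getting the last good value.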
Canonical wiki instance
GitHub Pages's 2015 rearchitecture.
Per request, the ngx_lua router needs the hostname → fileserver mapping from MySQL. To reduce DB load and tolerate MySQL blips, routing lookups are cached in nginx's shared memory zones for 30 seconds on each pages-fe node. Per the source: "We also use ngx_lua's shared memory zones to cache routing lookups on the pages-fe node for 30 seconds to reduce load on our MySQL infrastructure and also allow us to tolerate blips a little better."
Source: sources/2025-09-02-github-rearchitecting-github-pages.
The 30 s TTL is the operating point at which (a) new Pages sites publish within 30 s (tolerable vs. the pre-2015 30-minute cron regen), (b) MySQL blips shorter than 30 s are invisible to Pages routing, and (c) the per-node memory footprint stays manageable at the routing-table cardinality.
Trade-offs vs. alternatives
- vs. no cache — wins bandwidth + latency + outage tolerance; costs freshness within TTL + cache memory.
- vs. push-based invalidation — simpler (no invalidation fan-out path); loses immediate propagation.
- vs. long-TTL cache — tighter bound on staleness; smaller outage-tolerance window.
- vs. CDN edge cache (patterns/cdn-in-front-for-availability-fallback) — complementary, not competing. CDN handles full-origin outage; short-TTL handles transient downstream flakiness.
Related
- systems/nginx, systems/ngx-lua — canonical substrate; shared memory zones are the built-in primitive.
- systems/github-pages — canonical wiki instance.
- concepts/availability-dependency — the cost the pattern mitigates.
- concepts/cache-for-availability — the framing.
- patterns/db-routed-request-proxy — the partner pattern that creates the need for this one.
- patterns/cdn-in-front-for-availability-fallback — complementary outer-layer cache for full-origin outages.