CONCEPT Cited by 1 source

Negative caching¶

Definition¶

Negative caching means caching empty, no-result, or absence-of-data responses so that the cache does not re-query the backend for an answer that is legitimately empty. A cache that stores only positive responses cannot tell "no data" apart from "we haven't fetched it yet", so it goes back to the backend on every request, amplifying load in exactly the case where there is nothing to return.
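
A minimal sketch of the idea, assuming a read-through cache in front of a backend whose lookup returns `None` for "no data". The class name `NegativeCache`, the `_ABSENT` sentinel, and the `backend_calls` counter are hypothetical names chosen for illustration, not taken from any particular system:

```python
from typing import Callable, Dict, Hashable, Optional

# Hypothetical sentinel: "the backend was asked and had nothing for this key".
_ABSENT = object()


class NegativeCache:
    """Read-through cache that remembers empty answers as well as real ones."""

    def __init__(self, fetch: Callable[[Hashable], Optional[object]]) -> None:
        self._fetch = fetch  # backend lookup; assumed to return None for "no data"
        self._store: Dict[Hashable, object] = {}
        self.backend_calls = 0  # exposed only so the effect is observable

    def get(self, key: Hashable) -> Optional[object]:
        if key in self._store:  # a hit, whether positive or negative
            value = self._store[key]
            return None if value is _ABSENT else value
        # A true miss: the key has never been fetched.
        self.backend_calls += 1
        value = self._fetch(key)
        self._store[key] = _ABSENT if value is None else value
        return value
```

With this shape, repeated lookups of a key that does not exist reach the backend once. The sketch has no expiry at all, so it is only suitable where absence is permanent.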

Why it matters¶

Natural sources of absence:

Sparse time-series metrics: a metric emits events only occasionally, so most time buckets have zero data.
DNS NXDOMAIN: the classic negative-caching case. Resolvers cache NXDOMAIN responses for a TTL derived from the zone's SOA record (per RFC 2308, the lesser of the SOA record's own TTL and its MINIMUM field), so a nonexistent hostname is not re-resolved through the full resolver chain on every lookup.
Authz miss: "this principal has no grants on this resource" is an answer worth caching.
Search "no hits": empty result sets from a search backend.
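
For the DNS case, RFC 2308 sets the negative TTL to the minimum of the SOA record's own TTL and the SOA MINIMUM field. The helper below is a small worked version of that arithmetic. The function name and the `resolver_cap` parameter (a local upper bound that a resolver operator might apply) are assumptions for illustration:

```python
def negative_ttl(soa_record_ttl: int, soa_minimum: int, resolver_cap: int = 3600) -> int:
    """Seconds to cache an NXDOMAIN answer.

    soa_record_ttl: TTL of the SOA record returned in the authority section.
    soa_minimum:    the MINIMUM field inside that SOA record.
    resolver_cap:   hypothetical local ceiling; not part of the SOA record.
    """
    return max(0, min(soa_record_ttl, soa_minimum, resolver_cap))
```

For example, `negative_ttl(86400, 300)` gives 300 seconds, while `negative_ttl(60, 300)` gives 60 seconds because the SOA record's own TTL is the smaller value.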

The cost of not negative-caching scales with how often the absence is queried. For a sparse metric refreshed on a dashboard, the absence is queried every refresh — so the cost is continuous.

Netflix's implementation¶

Netflix's Druid cache uses empty sentinel values for time buckets where Druid returned no data. The gap-detection logic recognises the sentinel as "valid cached data, absent by design" rather than "missing, please refetch" (Source: sources/2026-04-06-netflix-stop-answering-the-same-question-twice-interval-aware-caching-for-druid).

Netflix calls out a subtle asymmetry: trailing empty buckets are deliberately not negative-cached. The reasoning:

If a query returns data up to minute 45 and nothing after, the trailing buckets for minutes 46+ might be empty because events haven't arrived yet, not because the bucket is truly empty.
Caching "no data" for a pending trailing bucket would exacerbate late-arrival chart delays — subsequent queries would see the cached "no data" even after the real events arrive.

Rule (Source: sources/2026-04-06-netflix-stop-answering-the-same-question-twice-interval-aware-caching-for-druid):

"We only cache empty entries for gaps between data points, not after the last one."

This is a nice illustration that negative caching is not unconditional — it interacts with the late-arrival model of the underlying data source.
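
The trailing-bucket rule can be sketched as follows. This is an illustration of the rule as quoted, not Netflix's code: the function name, the use of integer bucket indices, and `None` as the empty marker are all assumptions. The sketch also assumes that empty buckets before the first data point are safe to cache and that a completely empty result caches nothing, neither of which the source states:

```python
from typing import Dict, Iterable, Optional

EMPTY = None  # hypothetical marker for "known-empty bucket"


def entries_to_cache(
    buckets: Iterable[int], result: Dict[int, float]
) -> Dict[int, Optional[float]]:
    """Pick the cache entries to write for one query result.

    buckets: every bucket the query covered (e.g. minute indices).
    result:  only the buckets for which the backend returned data.
    """
    if not result:
        # No data point at all, so every bucket counts as trailing.
        return {}
    last_with_data = max(result)
    # Buckets after the last data point are skipped: they may be empty only
    # because late events have not arrived yet.
    return {b: result.get(b, EMPTY) for b in buckets if b <= last_with_data}
```

For a query over minutes 40 to 49 that returns data for minutes 41 and 45, the sketch writes entries for minutes 40 through 45 (four of them empty markers) and nothing for minutes 46 to 49, so the next query refetches the tail.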

Anti-patterns¶

Negative-caching trailing buckets in a streaming pipeline (Netflix's specific warning).
Unbounded-TTL negative cache for data that might come into existence later (e.g. a new DNS record being added, a user gaining a new permission).
Negative cache that doesn't distinguish "not found" from "backend unavailable" — caching an error as absence propagates outages into user-facing empty states.
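
A hedged sketch of how the last two anti-patterns might be avoided together: negative entries expire after a bounded TTL, and a backend failure is never stored as absence. `BackendUnavailable`, `TtlNegativeCache`, and the 30-second default are hypothetical. For brevity the sketch caches only negative answers:

```python
import time
from typing import Callable, Dict, Hashable, Optional


class BackendUnavailable(Exception):
    """Hypothetical error the backend client raises during an outage."""


class TtlNegativeCache:
    def __init__(
        self,
        fetch: Callable[[Hashable], Optional[object]],
        negative_ttl_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._negative_ttl_s = negative_ttl_s
        self._clock = clock
        self._absent_until: Dict[Hashable, float] = {}

    def lookup(self, key: Hashable) -> Optional[object]:
        now = self._clock()
        expires = self._absent_until.get(key)
        if expires is not None and now < expires:
            return None  # cached absence is still fresh
        # If fetch raises BackendUnavailable, the exception propagates and
        # nothing is written: an outage is not recorded as "not found".
        value = self._fetch(key)
        if value is None:
            self._absent_until[key] = now + self._negative_ttl_s
        else:
            self._absent_until.pop(key, None)  # data now exists; drop stale entry
        return value
```

The TTL bounds how long a newly created record or newly granted permission can stay invisible. Choosing its value is a trade-off between backend load and staleness that depends on the data source.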

Relationship to hot-key mitigation¶

Negative caching also mitigates a common hot-key pathology: a popular-but-empty query (e.g. querying a feature flag that's disabled for everyone) would otherwise pound the backend.

Seen in¶

sources/2026-04-06-netflix-stop-answering-the-same-question-twice-interval-aware-caching-for-druid — Netflix's Druid cache + the trailing-bucket exception.
concepts/hot-key lists "client-side caching with TTL + negative-cache on miss" as a mitigation.
DNS (not-yet-distilled on the wiki) — classic negative-caching case, RFC 2308.

Related¶

concepts/late-arriving-data: the reason the trailing-bucket exception exists.
concepts/granularity-aligned-bucket: the unit in which negative entries are stored in the Netflix case.
concepts/rolling-window-query: the workload shape.
systems/netflix-druid-interval-cache
patterns/interval-aware-query-cache