CONCEPT Cited by 1 source

DNS resolver caching

Recursive DNS resolvers (such as Unbound) cache query results for a TTL-bounded lifetime, so repeated lookups for the same name are answered locally instead of going upstream each time. Caching applies both to successful answers (positive caching) and to failures such as NXDOMAIN or SERVFAIL (negative caching, usually with shorter TTLs), which keeps the resolver from immediately re-sending a query that has just failed.
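The behaviour described above can be sketched as a small TTL-bounded cache. This is a minimal illustration, not how Unbound is implemented: the `TtlCache` class, the `upstream` callable (returning an answer or `None` plus a TTL in seconds), and the `negative_ttl` cap are all hypothetical names chosen for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Hypothetical upstream signature: name -> (answer or None on failure, TTL in seconds).
Upstream = Callable[[str], Tuple[Optional[str], float]]


@dataclass
class Entry:
    answer: Optional[str]  # None marks a cached failure (negative entry)
    expires_at: float      # clock value after which the entry is stale


class TtlCache:
    """Toy resolver cache with positive and negative caching."""

    def __init__(self, upstream: Upstream, negative_ttl: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._upstream = upstream
        self._negative_ttl = negative_ttl  # assumed cap on how long failures are kept
        self._clock = clock
        self._entries: Dict[str, Entry] = {}
        self.upstream_calls = 0

    def resolve(self, name: str) -> Optional[str]:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and entry.expires_at > now:
            return entry.answer  # cache hit, positive or negative

        # Miss or expired entry: ask upstream and remember the result.
        self.upstream_calls += 1
        answer, ttl = self._upstream(name)
        if answer is None:
            ttl = min(ttl, self._negative_ttl)  # failures expire sooner
        self._entries[name] = Entry(answer, now + ttl)
        return answer
```

Repeated calls to `resolve` for the same name within the TTL leave `upstream_calls` unchanged, and a failed lookup is retried upstream only once the shorter negative TTL has passed.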

Design choices

  • Per-host local resolver. Running a caching resolver on every application host (instead of only on central DNS servers) provides a first-line cache that absorbs repeated lookups for the same name before they reach the shared infrastructure. Canonical example: Stripe's deployment, as described in the 2024-12-12 source.
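  • Cache-size and TTL tuning. TTLs come from the authoritative nameserver's response; a resolver can clamp them with a floor and a ceiling (in Unbound, cache-min-ttl and cache-max-ttl) and can cap how long failures stay cached with cache-max-negative-ttl. Cache capacity is configured separately (in Unbound, rrset-cache-size and msg-cache-size).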
  • Cache effectiveness is workload-dependent. A reverse-DNS-heavy workload over a large IP set (e.g. every unique IP in a day's web-access logs) is effectively cache-cold: each lookup is for a different name, so the hit rate stays near zero regardless of cache size.
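The workload dependence in the last point can be illustrated with a short simulation. This is a sketch under stated assumptions: the cache is modelled as a plain LRU with no TTL expiry (an illustrative choice, not a claim about Unbound's internals), and the `hit_rate` function, the `hot` and `cold` workloads, and the example hostnames are hypothetical.

```python
from collections import OrderedDict
from typing import Iterable


def hit_rate(lookups: Iterable[str], cache_size: int) -> float:
    """Fraction of lookups served from an LRU cache holding cache_size names."""
    cache: "OrderedDict[str, None]" = OrderedDict()
    hits = total = 0
    for name in lookups:
        total += 1
        if name in cache:
            hits += 1
            cache.move_to_end(name)        # mark as most recently used
        else:
            cache[name] = None
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used name
    return hits / total if total else 0.0


# Forward-lookup workload: ten service names requested over and over.
hot = [f"svc-{i % 10}.internal.example" for i in range(10_000)]

# Reverse-lookup workload: one PTR name per unique IP, never repeated.
cold = [f"{i % 256}.{(i >> 8) % 256}.0.10.in-addr.arpa" for i in range(10_000)]

if __name__ == "__main__":
    for size in (100, 1_000_000):
        print(size, hit_rate(hot, size), hit_rate(cold, size))
```

In this model the repeated-name workload misses only on the first request for each name, while the unique-IP workload never hits, and enlarging the cache does not change that.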

Seen in

  • Stripe: The secret life of DNS packets (2024-12-12). Stripe runs Unbound on every host as a local caching tier in front of the central DNS server cluster. During the incident described in the post, a Hadoop job's reverse-lookup workload was cache-cold (each IP unique), so caching provided no mitigation: the load passed through to the central cluster and from there to the VPC resolver.