Exponential TTL¶
Definition¶
An exponential TTL (or more generally an age-based TTL ladder) is a cache-expiry strategy where the TTL assigned to a cached value increases monotonically with the age of the underlying data — fresh data gets a short TTL, old data gets a long TTL.
The "exponential" label comes from the canonical Netflix schedule — TTL doubles for each additional minute of data age — but the general idea applies with any monotonically-increasing schedule (linear, piecewise, logarithmic).
Why it exists¶
In time-series systems subject to late-arriving data, the trustworthiness of a value varies with its age:
- Recent data (last few minutes) may still change as delayed events trickle in. Confidence is low. Cache too long → wrong chart.
- Older data (30+ minutes old) is effectively final. Confidence is high. Cache it aggressively; re-querying wastes backend load.
A uniform TTL forces a single trade-off across data with very different confidence levels. An exponential / age-based ladder escapes the forced either/or of the cache-TTL staleness dilemma: each bucket gets a TTL matched to how much staleness is actually acceptable at that age.
Canonical instance: Netflix Druid interval cache¶
Netflix's Druid cache assigns per-bucket TTLs by age (Source: sources/2026-04-06-netflix-stop-answering-the-same-question-twice-interval-aware-caching-for-druid):
| Bucket age | TTL |
|---|---|
| < 2 min | 5 s (minimum) |
| 2 min | 10 s |
| 3 min | 20 s |
| 4 min | 40 s |
| 5 min | 80 s |
| n min | roughly 5 · 2^(n−1) s |
| (cap) | 1 hour (maximum) |
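The ladder above reduces to one formula: base TTL doubled per minute of age, clamped between a floor and a cap. A minimal sketch (function name, signature, and units are illustrative assumptions, not Netflix's code):

```python
def bucket_ttl_seconds(age_minutes: int,
                       base_ttl: int = 5,      # minimum TTL, freshest buckets (s)
                       max_ttl: int = 3600) -> int:
    """Age-based exponential TTL: double the base TTL for each additional
    minute of bucket age, clamped to [base_ttl, max_ttl]."""
    if age_minutes < 2:
        return base_ttl                        # < 2 min old: minimum TTL
    # n minutes old -> roughly base_ttl * 2^(n-1), capped at max_ttl
    return min(base_ttl * 2 ** (age_minutes - 1), max_ttl)
```

With the defaults this reproduces the table: 2 min → 10 s, 5 min → 80 s, and the cap of 3600 s is reached at 11 minutes of age.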
- Fresh data cycles rapidly → late-arriving corrections picked up quickly
- Old data lingers → confidence grows with time, cache hit rate climbs for the bulk of the query window
For a 3-hour rolling-window query at 1-minute granularity, this ladder ensures the vast majority of the query is served from cache — only the newest few minutes hit Druid.
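That "vast majority" claim can be made concrete by counting buckets per TTL band across the window. A hypothetical illustration (the bucket counts follow from the ladder above, not from a figure in the source):

```python
def ttl_seconds(age_minutes: int, base: int = 5, cap: int = 3600) -> int:
    # Same age-based exponential ladder as the table above.
    return base if age_minutes < 2 else min(base * 2 ** (age_minutes - 1), cap)

# 180 one-minute buckets, newest (age 0) to oldest (age 179).
ages = range(180)
at_cap = sum(1 for a in ages if ttl_seconds(a) == 3600)   # 1-hour TTL
churning = sum(1 for a in ages if ttl_seconds(a) <= 80)   # <= 5 min old
print(at_cap, churning)
```

Under this schedule 169 of 180 buckets (about 94%) sit at the 1-hour cap, and only the 6 newest buckets cycle on sub-2-minute TTLs, which is why nearly all of the window is served from cache.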
Why "exponential" is a good fit¶
- Confidence in a bucket's value grows roughly exponentially with its age in most event pipelines (late events are Pareto-distributed in arrival delay).
- Doubling gives the freshest buckets short TTLs and reaches the cap in a small number of steps, which bounds both the staleness risk and the size of the TTL-ladder table.
Netflix's schedule caps at 1 hour rather than growing indefinitely; the cap is a practical bound rather than a correctness bound.
Contrast with uniform TTL and event-driven invalidation¶
| Strategy | When it fits |
|---|---|
| Uniform TTL | Stable data, homogeneous confidence |
| Age-based exponential TTL | Time-series data with late-arriving events; confidence grows with age |
| Invalidation-based | Source can push invalidation messages (CDC, pub/sub) |
| Write-through | Writes are the authoritative update path |
Time-series + late-arriving-data is the exact shape where uniform TTL forces a bad choice (long TTL = stale; short TTL = load) and invalidation is impractical (no per-value change events; late arrivals retroactively adjust many buckets).
Seen in¶
- sources/2026-04-06-netflix-stop-answering-the-same-question-twice-interval-aware-caching-for-druid — canonical wiki instance at Netflix.
Related¶
- concepts/late-arriving-data — the forcing function.
- concepts/staleness-vs-load-tradeoff — the meta-trade-off that exponential TTLs navigate.
- concepts/granularity-aligned-bucket — the bucket-shape that TTLs are applied to.
- concepts/rolling-window-query — the workload pattern.
- concepts/cache-ttl-staleness-dilemma — the forced-either-or that exponential TTLs escape from.
- systems/netflix-druid-interval-cache
- patterns/age-based-exponential-ttl
- patterns/interval-aware-query-cache