
Interval-aware query cache

Pattern

When serving rolling-window time-series queries at scale, don't cache query responses as opaque blobs keyed on (query, interval) — they miss every time the interval shifts. Instead:

  1. Parse the query to extract the time interval, granularity, and structure.
  2. Hash the query shape (datasource, filters, aggregations, granularity, structural context) — excluding the time interval and volatile context — to get a stable shape-only outer cache key.
  3. Decompose the response into granularity-aligned time buckets — one cache entry per bucket — keyed on the bucket timestamp encoded so lexicographic order matches chronological order (big-endian bytes).
  4. Assign age-based TTLs per bucket — fresh buckets get short TTLs (handle late-arrival corrections), old buckets get long TTLs (they're settled).
  5. On read, scan cached buckets in contiguous prefix from the interval start. First gap → stop and issue one narrowed backend query for the missing tail (patterns/partial-cache-hit-with-tail-fetch).
  6. Recompose: concatenate cached prefix + fresh tail, sorted by timestamp. From the caller's perspective, the shape is identical to a direct backend response.
  7. Write back asynchronously — the fresh buckets are parsed and stored in the cache after the response is sent.
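The seven steps above can be sketched end-to-end. This is a minimal illustration, not Netflix's implementation: the names (`serve`, `shape_key`, the dict-backed `cache`) are invented for the sketch, the granularity is assumed fixed at one minute, and the write-back is done inline where a production system would do it asynchronously:

```python
import hashlib
import json
import struct

BUCKET_SECONDS = 60  # assumed query granularity: 1-minute buckets

def shape_key(query: dict) -> str:
    """Outer key: hash of the query with the interval and volatile
    context stripped, so all shifted windows of one query collide."""
    shape = {k: v for k, v in query.items() if k not in ("interval", "context")}
    return hashlib.sha256(json.dumps(shape, sort_keys=True).encode()).hexdigest()

def bucket_key(ts: int) -> bytes:
    """Inner key: big-endian bucket-start timestamp, so lexicographic
    byte order matches chronological order."""
    return struct.pack(">Q", ts - ts % BUCKET_SECONDS)

def serve(query: dict, cache: dict, backend_fetch):
    """Read path: scan the cached contiguous prefix, issue one narrowed
    backend query for the missing tail, recompose, write back."""
    outer = shape_key(query)
    start, end = query["interval"]
    prefix, tail_start = [], None
    for ts in range(start - start % BUCKET_SECONDS, end, BUCKET_SECONDS):
        row = cache.get((outer, bucket_key(ts)))
        if row is None:           # first gap: stop scanning
            tail_start = ts
            break
        prefix.append(row)
    if tail_start is None:
        return prefix             # full hit, no backend query
    tail = backend_fetch({**query, "interval": (tail_start, end)})
    # In the pattern this write-back happens after the response is sent.
    for row in tail:
        cache[(outer, bucket_key(row["timestamp"]))] = row
    return prefix + tail          # same shape as a direct backend response
```

A second call with the same shape but a shifted interval reuses every bucket the first call stored and fetches only the uncached tail.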

When to use

  • Workload is dominated by rolling-window time-series queries ([now - Δ, now]) that refresh frequently.
  • Backend (Druid, Prometheus, InfluxDB, M3, VictoriaMetrics, Pinot, etc.) has expensive query capacity relative to storage.
  • Consumers are dashboards / monitoring / alerting that can tolerate seconds of staleness (formalised as a staleness-vs-load trade-off).
  • Many concurrent consumers view near-identical data at the same time (live shows, global launches, incidents, marketing events).

When not to use

  • Workload is ad-hoc analytics — each query is one-off; no reuse.
  • Consumers need point-in-time consistency (trading, order matching).
  • Query granularity is sub-second and windows are long — bucket count explodes.
  • Backend already has an effective full-result cache and windows don't shift (fixed-interval reports).

Netflix's canonical implementation

Netflix's Druid interval cache is the canonical wiki instance (Source: sources/2026-04-06-netflix-stop-answering-the-same-question-twice-interval-aware-caching-for-druid). Architectural shape:

  • Outer key: SHA-256 over the Druid timeseries / groupBy query with interval + volatile context stripped.
  • Inner keys: big-endian bucket-start timestamps at query granularity (1-min minimum).
  • TTL ladder: 5 s for buckets < 2 min old, doubling per additional minute of age, capped at 1 hour.
  • Storage: KVDAL two-level-map on Cassandra with per-inner-key TTLs + native range scans.
  • Deployment: external service behind the Druid Router via intercepting proxy.
  • Negative caching for interior empty buckets; not for trailing empty buckets (they might be unsettled).
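The TTL ladder above can be written as a small function. The exact doubling schedule between the 5 s floor and the 1 h cap is an assumption from the summary, not a confirmed detail of Netflix's implementation:

```python
def bucket_ttl_seconds(age_seconds: float) -> int:
    """Age-based TTL ladder (assumed schedule): 5 s for buckets under
    2 minutes old, doubling per additional minute of age, capped at
    1 hour. Fresh buckets expire fast to absorb late-arrival
    corrections; settled buckets stay cached for a long time."""
    age_minutes = int(age_seconds // 60)
    if age_minutes < 2:
        return 5
    return min(3600, 5 * 2 ** (age_minutes - 1))
```

Under this schedule a 2-minute-old bucket lives 10 s, a 5-minute-old bucket 80 s, and anything older than about 12 minutes hits the 1-hour cap.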

Production results (typical day): 82% of queries get at least a partial hit, 84% of result data served from cache, cache P90 ~5.5 ms; A/B experiment showed ~33% drop in Druid queries, ~66% P90 improvement, up to 14× result-bytes reduction.

Why it works (mechanism)

  • Shift invariance: overlapping windows share buckets. A 3-hour-window query that's refreshed 30 s later reuses 179 of 180 buckets.
  • Backend load bounded: at most one backend fetch per request, narrowed to the uncached tail — so the per-query backend cost drops roughly proportional to the cache-hit rate.
  • Concurrency amplification: N concurrent viewers of the same dashboard emit N requests, but only the first request on a cold bucket misses; subsequent requests within the TTL hit cache. Netflix quote: "the query rate reaching Druid is essentially the same with 1 viewer or 100."
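The shift-invariance point can be checked in a few lines. The 1-minute granularity and minute-aligned window edges are assumptions of this sketch; exact edge counts depend on alignment, but the shape of the result is the same — a refreshed window shares all but a bucket or so with its predecessor:

```python
def bucket_set(start: int, end: int, granularity: int = 60) -> set[int]:
    """Bucket-start timestamps covering [start, end)."""
    first = start - start % granularity
    return set(range(first, end, granularity))

# A 3-hour window refreshed 30 s later, with a minute-aligned `now`:
now = 1_700_000_040                          # arbitrary aligned timestamp
old = bucket_set(now - 3 * 3600, now)        # 180 one-minute buckets
new = bucket_set(now - 3 * 3600 + 30, now + 30)
shared = old & new                           # every old bucket is reused
cold = new - old                             # only the newest bucket misses
```

In this aligned case all 180 prior buckets are reusable and only the single newest bucket needs a backend fetch.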

Generalisation

Netflix explicitly flags the pattern as non-Druid-specific: it "could apply to any time-series database with frequent overlapping-window queries" (Source: sources/2026-04-06-netflix-stop-answering-the-same-question-twice-interval-aware-caching-for-druid). Candidate backends: Prometheus, InfluxDB, M3DB, VictoriaMetrics, TimescaleDB, Apache Pinot, ClickHouse time-series workloads, BigQuery over time-partitioned tables.

The required backend capabilities are modest:

  • Query parseable into (shape, interval, granularity).
  • Responses decomposable by time bucket.
  • Support for narrowed re-queries (any time-series backend supports this).

The required cache-layer capabilities:

  • Two-level map with per-inner-key TTLs.
  • Range scans on inner keys.

(KVDAL has both; Redis with hash-fields + per-field TTLs has them partially; Cassandra directly has them; any system with these primitives can host the cache.)
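The two primitives can be demonstrated with an in-memory sketch — KVDAL on Cassandra provides them natively, and this toy class (the name `TwoLevelTTLMap` is invented here) only illustrates the contract:

```python
import time
from bisect import bisect_left, insort

class TwoLevelTTLMap:
    """In-memory sketch of the cache layer's two required primitives:
    a two-level map with a TTL per inner key, and ordered range scans
    over inner keys."""

    def __init__(self):
        self._maps = {}    # outer key -> {inner key: (value, expiry)}
        self._order = {}   # outer key -> sorted list of inner keys

    def put(self, outer: str, inner: bytes, value, ttl_seconds: float):
        entries = self._maps.setdefault(outer, {})
        if inner not in entries:
            insort(self._order.setdefault(outer, []), inner)
        entries[inner] = (value, time.monotonic() + ttl_seconds)

    def range_scan(self, outer: str, start: bytes, end: bytes):
        """Yield (inner, value) for unexpired keys in [start, end), in
        lexicographic order — which is chronological order when inner
        keys are big-endian bucket timestamps."""
        keys = self._order.get(outer, [])
        now = time.monotonic()
        for i in range(bisect_left(keys, start), len(keys)):
            key = keys[i]
            if key >= end:
                break
            value, expiry = self._maps[outer][key]
            if expiry > now:
                yield key, value
```

The read path then becomes a single `range_scan` over the interval's bucket keys, stopping at the first gap.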

Forward direction

Netflix frames the external-proxy shape as a workaround: the long-term direction is upstreaming the capability into the backend itself — in Druid's case, an opt-in result-level cache in the Brokers with metrics to tune TTLs. This avoids the extra network hops, infra overhead, and SQL-templating workarounds.
