CONCEPT Cited by 1 source
Query-structure-aware caching¶
Definition¶
Query-structure-aware caching is caching where the cache parses the incoming query, understands its logical structure, and decomposes the response along a structural axis — rather than treating the query string as an opaque key and the response as an opaque blob.
Concretely, the cache can:
- Identify what the query is asking — extract the datasource, filters, aggregations, granularity, time interval.
- Normalise / hash the cacheable part — separate the stable "what is being asked" (shape) from the variable dimensions (e.g. time interval, context flags).
- Decompose the response along those variable dimensions into independently-keyed fragments.
- Recompose on retrieval — assemble from cached fragments plus a narrowed backend fetch for whatever is missing.
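The normalise and decompose steps above can be sketched as follows. This is a minimal illustration rather than any particular implementation: it assumes the query arrives as a Druid-style JSON object with an `intervals` field, that each response row carries a `timestamp`, and that context handling is left out. `split_query`, `decompose`, and `VARIABLE_FIELDS` are hypothetical names.

```python
import hashlib
import json

# Assumption: these fields vary between otherwise identical queries.
VARIABLE_FIELDS = {"intervals"}


def split_query(query: dict) -> tuple[str, list]:
    """Return (shape hash, variable part) for a query given as a dict."""
    shape = {k: v for k, v in query.items() if k not in VARIABLE_FIELDS}
    # Canonical JSON so that key order does not change the hash.
    canonical = json.dumps(shape, sort_keys=True, separators=(",", ":"))
    shape_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return shape_hash, query.get("intervals", [])


def decompose(shape_hash: str, rows: list[dict]) -> dict:
    """Split a response into fragments keyed by (shape hash, bucket timestamp)."""
    return {(shape_hash, row["timestamp"]): row for row in rows}
```

Two queries that differ only in `intervals` (or in key order) produce the same shape hash, so their per-bucket fragments land under the same outer key.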
Contrast with opaque caching¶
| Property | Opaque KV cache | Query-structure-aware cache |
|---|---|---|
| Cache key | Full query string / query hash | Shape hash + structural axis coordinate(s) |
| Hit granularity | All-or-nothing | Per-fragment |
| Rolling-window query | Miss every shift | Reuse all overlapping fragments |
| Complexity | Low | Requires query parser + composable response |
| Backend load on partial overlap | Full query re-run | One narrowed backend query |
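The rolling-window row can be made concrete with a toy count of how many buckets need backend work under each keying scheme. The sketch assumes integer bucket indices and half-open windows; the function names are hypothetical and no real cache is involved.

```python
def buckets(start: int, end: int) -> set[int]:
    """Bucket indices covered by the half-open window [start, end)."""
    return set(range(start, end))


def opaque_backend_buckets(cached_window: tuple, new_window: tuple) -> set[int]:
    """Opaque key: any change in the window is a full miss."""
    if cached_window == new_window:
        return set()
    return buckets(*new_window)


def structural_backend_buckets(cached_window: tuple, new_window: tuple) -> set[int]:
    """Per-fragment key: only buckets not already cached need fetching."""
    return buckets(*new_window) - buckets(*cached_window)
```

For a 60-bucket window shifted forward by one bucket, the opaque scheme recomputes all 60 buckets while the per-fragment scheme fetches only the one new bucket.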
Canonical instance: Netflix Druid interval cache¶
Netflix's Druid cache implements this for time-series queries (Source: sources/2026-04-06-netflix-stop-answering-the-same-question-twice-interval-aware-caching-for-druid). The cache:
- Parses `timeseries` / `groupBy` queries.
- Extracts datasource + filters + aggregations + granularity + interval + context.
- Computes SHA-256 over the query with the interval and volatile context removed — so the shape hash is stable across different overlapping time windows.
- Keys the cache as a map-of-maps: outer key = shape hash, inner keys = granularity-aligned bucket timestamps (encoded big-endian for lexicographic range scans).
- Decomposes responses along the time axis into per-bucket entries.
- Recomposes by concatenating cached buckets + tail-fetched fresh buckets, sorted by timestamp.
The crucial move is separating the shape of the query from the interval it's evaluated over — so shifted intervals don't invalidate the cached buckets that still overlap.
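A minimal in-memory sketch of the map-of-maps layout and the big-endian bucket keys is shown below. It is an illustration under assumptions, not Netflix's code: timestamps are taken to be non-negative epoch milliseconds, the store is a plain Python dict, and `IntervalCache`, `bucket_key`, and `recompose` are hypothetical names.

```python
import struct
from bisect import bisect_left


def bucket_key(ts_ms: int) -> bytes:
    """Encode a non-negative epoch-millisecond timestamp as 8 big-endian bytes."""
    return struct.pack(">Q", ts_ms)


class IntervalCache:
    def __init__(self) -> None:
        # Outer key: shape hash. Inner map: bucket key -> cached row.
        self._store: dict[str, dict[bytes, dict]] = {}

    def put(self, shape_hash: str, ts_ms: int, row: dict) -> None:
        self._store.setdefault(shape_hash, {})[bucket_key(ts_ms)] = row

    def scan(self, shape_hash: str, start_ms: int, end_ms: int) -> list[dict]:
        """Return cached rows with start_ms <= timestamp < end_ms, in time order."""
        inner = self._store.get(shape_hash, {})
        # Byte order equals numeric order for big-endian unsigned keys.
        # Sorting on every scan keeps the sketch short; a real store
        # would keep its keys ordered.
        keys = sorted(inner)
        lo = bisect_left(keys, bucket_key(start_ms))
        hi = bisect_left(keys, bucket_key(end_ms))
        return [inner[k] for k in keys[lo:hi]]


def recompose(cached: list[dict], fresh: list[dict]) -> list[dict]:
    """Merge cached and freshly fetched rows, sorted by timestamp."""
    return sorted(cached + fresh, key=lambda r: r["timestamp"])
```

Big-endian encoding matters because a lexicographic scan compares the most significant byte first; with a little-endian encoding, 256 would sort before 1 and a byte-range scan would no longer correspond to a time range.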
Context-aware subtlety¶
Netflix's cache includes certain Druid context properties in the hash because "there are some context properties that can alter the response structure or contents." Volatile context (e.g. query ID) is excluded from the hash; structural context (e.g. properties that change response shape) is included. This is the general difficulty with query-structure-aware caches: the parser has to know which parts of the input affect the response and which do not. Hashing a volatile field causes needless misses; omitting a structural one lets the cache serve a wrong answer.
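One way to express this split is a denylist of volatile keys, sketched below. The source does not say which properties Netflix classifies each way, so the set contents are an assumption: `queryId` follows the query-ID example above, and `skipEmptyBuckets` appears in the usage note only as an illustrative property that changes response contents. `hashable_context` is a hypothetical helper name.

```python
# Hypothetical classification; which keys belong here depends on the backend.
VOLATILE_CONTEXT_KEYS = {"queryId"}


def hashable_context(context: dict) -> dict:
    """Keep only the context entries that should contribute to the shape hash.

    Unknown keys are kept, so a property nobody classified causes extra
    cache misses rather than a wrong cached answer.
    """
    return {k: v for k, v in sorted(context.items())
            if k not in VOLATILE_CONTEXT_KEYS}
```

With this filter, `{"queryId": "a", "skipEmptyBuckets": True}` and `{"queryId": "b", "skipEmptyBuckets": True}` reduce to the same hashable context, while flipping `skipEmptyBuckets` produces a different one.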
What this enables¶
Query-structure-aware caching is what makes patterns/interval-aware-query-cache actually work:
- Without structural decomposition, a shifted window invalidates the cache.
- Without structural normalisation, two equivalent queries (different literal strings) would miss cache.
- Without structural reassembly, a partial hit (some buckets cached, some missing) would require re-fetching the whole query from the backend.
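The reassembly point depends on turning a partial hit into one narrowed backend query. A minimal sketch of that step follows, assuming the cache holds a contiguous run of buckets from the start of the window and that everything after the first missing bucket is re-fetched; `narrowed_interval` is a hypothetical helper and the walk-until-first-gap strategy is an assumption, not a description of any specific system.

```python
from typing import Optional


def narrowed_interval(cached_ts: set[int], start: int, end: int,
                      step: int) -> Optional[tuple[int, int]]:
    """Return the (start, end) range still to fetch, or None if fully cached.

    Walks buckets from the start of the half-open window [start, end) and
    stops at the first one missing from the cache; everything from there
    to `end` goes to the backend as a single query.
    """
    for ts in range(start, end, step):
        if ts not in cached_ts:
            return (ts, end)
    return None
```

With buckets 0, 60, and 120 cached and a window of 0 to 300 in steps of 60, only the range 180 to 300 goes to the backend.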
Seen in¶
- sources/2026-04-06-netflix-stop-answering-the-same-question-twice-interval-aware-caching-for-druid — canonical wiki instance at Netflix for time-series queries.
Related¶
- concepts/granularity-aligned-bucket — the decomposition axis.
- concepts/rolling-window-query — the workload.
- concepts/time-series-bucketing — the broader bucketing framing.
- systems/netflix-druid-interval-cache
- patterns/interval-aware-query-cache — the pattern this concept underpins.