CONCEPT Cited by 1 source
DBSQL caching tiers¶
Definition¶
A two-tier cache hierarchy in a Databricks SQL warehouse: a disk cache (warehouse-local SSD holding hot Parquet files / data blocks) and a query result cache (QRC) (full query results keyed on query text + input table version, served without re-execution). For repetitive BI workloads, "caching turns many requests into millisecond-latency responses at near-zero compute cost".
Verbatim from the Databricks BI Serving Pointers source: "Lean on DBSQL's caching tiers. Disk cache keeps hot data local to the warehouse, and query result cache (QRC) serves repeated queries without re-execution."
The two tiers¶
| Tier | Granularity | Hit | Miss path |
|---|---|---|---|
| Query Result Cache (QRC) | Full query result (rows + metadata) keyed on query text + table version | Near-zero compute — return the cached result. | Falls through to disk cache + re-execution. |
| Disk cache | Hot Parquet files / data blocks on warehouse-local SSD | Local I/O instead of cloud-storage round-trip. | Falls through to cloud storage + Predictive-Optimization-managed Delta data skipping. |
The compounding shape: a repeating dashboard query hits QRC on the second-and-subsequent invocations (so 0 compute), and even the first invocation hits the disk cache for files served recently (so reduced cloud-storage I/O).
Why the two-tier shape works for BI¶
BI workloads have three structural properties that make this hierarchy load-bearing:
- Repetitive query patterns: the same dashboard widget queries run on schedule and on user interaction. QRC serves these for free on hits.
- Hot data working set: dashboards tend to query recent / high-traffic data, which fits the disk cache.
- Latency-sensitivity: BI users are interactively waiting; a QRC hit is sub-second by construction.
The first property is what distinguishes BI from ad-hoc analytics. Ad-hoc workloads have low cache-hit rates because each query is novel; BI has high cache-hit rates because the same queries repeat.
Composition with materialization¶
The DBSQL caching tiers compose with metric-view materialization in a clean hierarchy:
Consumer query
│
▼
Query Result Cache ─────► hit: return result (0 compute)
│ miss
▼
Materialized form ─────► hit: scan pre-aggregated rows
│ miss
▼
Disk cache ─────► hit: local-SSD I/O
│ miss
▼
Cloud storage + Delta data skipping
+ Liquid Clustering + Predictive Optimization stats
Each layer absorbs as many queries as it can; what falls through hits the next layer. The architectural value: the base substrate (well-clustered, well-statistics-ed, well-optimised managed tables) is fast even at the bottom; the layers above make the common case sub-second.
Cache invalidation contract¶
The source doesn't document QRC's invalidation semantics in detail — the implicit contract is that the QRC key includes a table-version component so writes invalidate cache entries on the underlying tables. The disk cache is more forgiving (it caches at the file level, and Delta files are immutable — new versions write new files).
The freshness contract under high write churn is reserved for future ingests.
Cold-start under scale-out¶
When a SQL warehouse scales out to absorb a concurrency burst, new compute nodes start with empty disk caches. The first queries on those nodes pay the cloud-storage I/O cost. This is a transient cold-start cost during burst — significant for sub-second BI workloads where even 100 ms of cloud-storage I/O is visible to the user. Mitigation: warehouse pre-warming or auto-scaling rules that scale out before the burst arrives.
Failure modes¶
- QRC blindness to non-deterministic queries. Queries with
current_timestamp()or session-dependent functions can't cache cleanly because the result depends on call time. The source doesn't address this; standard QRC implementations exclude such queries from the cache. - Disk-cache thrashing on burst-and-cool patterns. If the warehouse scales out, builds disk cache, then scales in, the disk cache is lost — the next burst pays cold-start I/O again.
- Cache-induced staleness via stale QRC under failed invalidation. If a write doesn't invalidate the QRC entry (bug in the substrate), users see stale data without warning.
Sibling cache hierarchies on the wiki¶
| Domain | Two-tier shape |
|---|---|
| Web (frontend) | concepts/stale-while-revalidate-cache — service worker hot result cache + IndexedDB disk cache. |
| OS / IO | Linux page cache + on-disk file system. |
| LLM inference | concepts/kv-cache (per-request) + concepts/implicit-prompt-caching (cross-request prefix). |
| MySQL / Postgres | Buffer pool + on-disk pages. |
The shared principle: a small fast tier in front of a larger slow tier, with the small tier sized to the working set and the large tier optimised for capacity.
What this is NOT¶
- Not the same as a CDN cache. CDN caches sit at network edge; DBSQL caches sit at warehouse compute. Different optimisation goals (geo-locality vs warehouse-locality).
- Not magic for ad-hoc queries. Workloads where every query is novel see no QRC benefit and limited disk-cache benefit.
- Not user-controllable per-query (in this disclosure). The source treats caching as an automatic substrate property; per- query cache directives are not discussed.
Seen in¶
- sources/2026-05-27-databricks-bi-serving-pointers-maximizing-for-performance-and-tco — first wiki canonicalisation of the two-tier DBSQL caching shape. Names disk cache + QRC verbatim and ties them to "millisecond-latency responses at near-zero compute cost" for dashboard-pattern workloads. Reserved for future ingests: QRC invalidation algorithm, disk-cache eviction policy under warehouse scaling, cache-warming primitives, per-query cache control surfaces.
Related¶
- systems/databricks-sql-warehouses — the substrate where these caches live.
- concepts/cache-hit-rate — the metric that quantifies cache effectiveness.
- concepts/headless-bi-semantic-layer — the layer above caching (caching wins compound from the metric view down).
- concepts/metric-view-materialization — composes with caching in the hierarchy above.