CONCEPT Cited by 1 source

DBSQL caching tiers¶

Definition¶

A two-tier cache hierarchy in a Databricks SQL warehouse: a disk cache (warehouse-local SSD holding hot Parquet files / data blocks) and a query result cache (QRC) (full query results keyed on query text + input table version, served without re-execution). For repetitive BI workloads, "caching turns many requests into millisecond-latency responses at near-zero compute cost".

Verbatim from the Databricks BI Serving Pointers source: "Lean on DBSQL's caching tiers. Disk cache keeps hot data local to the warehouse, and query result cache (QRC) serves repeated queries without re-execution."

The two tiers¶

Tier	Granularity	Hit	Miss path
Query Result Cache (QRC)	Full query result (rows + metadata) keyed on query text + table version	Near-zero compute — return the cached result.	Falls through to disk cache + re-execution.
Disk cache	Hot Parquet files / data blocks on warehouse-local SSD	Local I/O instead of cloud-storage round-trip.	Falls through to cloud storage + Predictive-Optimization-managed Delta data skipping.

The compounding shape: a repeating dashboard query hits QRC on the second-and-subsequent invocations (so 0 compute), and even the first invocation hits the disk cache for files served recently (so reduced cloud-storage I/O).

Why the two-tier shape works for BI¶

BI workloads have three structural properties that make this hierarchy load-bearing:

Repetitive query patterns: the same dashboard widget queries run on schedule and on user interaction. QRC serves these for free on hits.
Hot data working set: dashboards tend to query recent / high-traffic data, which fits the disk cache.
Latency-sensitivity: BI users are interactively waiting; a QRC hit is sub-second by construction.

The first property is what distinguishes BI from ad-hoc analytics. Ad-hoc workloads have low cache-hit rates because each query is novel; BI has high cache-hit rates because the same queries repeat.

Composition with materialization¶

The DBSQL caching tiers compose with metric-view materialization in a clean hierarchy:

Consumer query
    │
    ▼
Query Result Cache  ─────► hit: return result (0 compute)
    │ miss
    ▼
Materialized form   ─────► hit: scan pre-aggregated rows
    │ miss
    ▼
Disk cache          ─────► hit: local-SSD I/O
    │ miss
    ▼
Cloud storage + Delta data skipping
    + Liquid Clustering + Predictive Optimization stats

Each layer absorbs as many queries as it can; what falls through hits the next layer. The architectural value: the base substrate (well-clustered, well-statistics-ed, well-optimised managed tables) is fast even at the bottom; the layers above make the common case sub-second.

Cache invalidation contract¶

The source doesn't document QRC's invalidation semantics in detail — the implicit contract is that the QRC key includes a table-version component so writes invalidate cache entries on the underlying tables. The disk cache is more forgiving (it caches at the file level, and Delta files are immutable — new versions write new files).

The freshness contract under high write churn is reserved for future ingests.

Cold-start under scale-out¶

When a SQL warehouse scales out to absorb a concurrency burst, new compute nodes start with empty disk caches. The first queries on those nodes pay the cloud-storage I/O cost. This is a transient cold-start cost during burst — significant for sub-second BI workloads where even 100 ms of cloud-storage I/O is visible to the user. Mitigation: warehouse pre-warming or auto-scaling rules that scale out before the burst arrives.

Failure modes¶

QRC blindness to non-deterministic queries. Queries with current_timestamp() or session-dependent functions can't cache cleanly because the result depends on call time. The source doesn't address this; standard QRC implementations exclude such queries from the cache.
Disk-cache thrashing on burst-and-cool patterns. If the warehouse scales out, builds disk cache, then scales in, the disk cache is lost — the next burst pays cold-start I/O again.
Cache-induced staleness via stale QRC under failed invalidation. If a write doesn't invalidate the QRC entry (bug in the substrate), users see stale data without warning.

Sibling cache hierarchies on the wiki¶

Domain	Two-tier shape
Web (frontend)	concepts/stale-while-revalidate-cache — service worker hot result cache + IndexedDB disk cache.
OS / IO	Linux page cache + on-disk file system.
LLM inference	concepts/kv-cache (per-request) + concepts/implicit-prompt-caching (cross-request prefix).
MySQL / Postgres	Buffer pool + on-disk pages.

The shared principle: a small fast tier in front of a larger slow tier, with the small tier sized to the working set and the large tier optimised for capacity.

What this is NOT¶

Not the same as a CDN cache. CDN caches sit at network edge; DBSQL caches sit at warehouse compute. Different optimisation goals (geo-locality vs warehouse-locality).
Not magic for ad-hoc queries. Workloads where every query is novel see no QRC benefit and limited disk-cache benefit.
Not user-controllable per-query (in this disclosure). The source treats caching as an automatic substrate property; per- query cache directives are not discussed.

Seen in¶

sources/2026-05-27-databricks-bi-serving-pointers-maximizing-for-performance-and-tco — first wiki canonicalisation of the two-tier DBSQL caching shape. Names disk cache + QRC verbatim and ties them to "millisecond-latency responses at near-zero compute cost" for dashboard-pattern workloads. Reserved for future ingests: QRC invalidation algorithm, disk-cache eviction policy under warehouse scaling, cache-warming primitives, per-query cache control surfaces.

systems/databricks-sql-warehouses — the substrate where these caches live.
concepts/cache-hit-rate — the metric that quantifies cache effectiveness.
concepts/headless-bi-semantic-layer — the layer above caching (caching wins compound from the metric view down).
concepts/metric-view-materialization — composes with caching in the hierarchy above.