Skip to content

CONCEPT Cited by 1 source

DBSQL caching tiers

Definition

A two-tier cache hierarchy in a Databricks SQL warehouse: a disk cache (warehouse-local SSD holding hot Parquet files / data blocks) and a query result cache (QRC) (full query results keyed on query text + input table version, served without re-execution). For repetitive BI workloads, "caching turns many requests into millisecond-latency responses at near-zero compute cost".

Verbatim from the Databricks BI Serving Pointers source: "Lean on DBSQL's caching tiers. Disk cache keeps hot data local to the warehouse, and query result cache (QRC) serves repeated queries without re-execution."

The two tiers

Tier Granularity Hit Miss path
Query Result Cache (QRC) Full query result (rows + metadata) keyed on query text + table version Near-zero compute — return the cached result. Falls through to disk cache + re-execution.
Disk cache Hot Parquet files / data blocks on warehouse-local SSD Local I/O instead of cloud-storage round-trip. Falls through to cloud storage + Predictive-Optimization-managed Delta data skipping.

The compounding shape: a repeating dashboard query hits QRC on the second-and-subsequent invocations (so 0 compute), and even the first invocation hits the disk cache for files served recently (so reduced cloud-storage I/O).

Why the two-tier shape works for BI

BI workloads have three structural properties that make this hierarchy load-bearing:

  1. Repetitive query patterns: the same dashboard widget queries run on schedule and on user interaction. QRC serves these for free on hits.
  2. Hot data working set: dashboards tend to query recent / high-traffic data, which fits the disk cache.
  3. Latency-sensitivity: BI users are interactively waiting; a QRC hit is sub-second by construction.

The first property is what distinguishes BI from ad-hoc analytics. Ad-hoc workloads have low cache-hit rates because each query is novel; BI has high cache-hit rates because the same queries repeat.

Composition with materialization

The DBSQL caching tiers compose with metric-view materialization in a clean hierarchy:

Consumer query
Query Result Cache  ─────► hit: return result (0 compute)
    │ miss
Materialized form   ─────► hit: scan pre-aggregated rows
    │ miss
Disk cache          ─────► hit: local-SSD I/O
    │ miss
Cloud storage + Delta data skipping
    + Liquid Clustering + Predictive Optimization stats

Each layer absorbs as many queries as it can; what falls through hits the next layer. The architectural value: the base substrate (well-clustered, well-statistics-ed, well-optimised managed tables) is fast even at the bottom; the layers above make the common case sub-second.

Cache invalidation contract

The source doesn't document QRC's invalidation semantics in detail — the implicit contract is that the QRC key includes a table-version component so writes invalidate cache entries on the underlying tables. The disk cache is more forgiving (it caches at the file level, and Delta files are immutable — new versions write new files).

The freshness contract under high write churn is reserved for future ingests.

Cold-start under scale-out

When a SQL warehouse scales out to absorb a concurrency burst, new compute nodes start with empty disk caches. The first queries on those nodes pay the cloud-storage I/O cost. This is a transient cold-start cost during burst — significant for sub-second BI workloads where even 100 ms of cloud-storage I/O is visible to the user. Mitigation: warehouse pre-warming or auto-scaling rules that scale out before the burst arrives.

Failure modes

  • QRC blindness to non-deterministic queries. Queries with current_timestamp() or session-dependent functions can't cache cleanly because the result depends on call time. The source doesn't address this; standard QRC implementations exclude such queries from the cache.
  • Disk-cache thrashing on burst-and-cool patterns. If the warehouse scales out, builds disk cache, then scales in, the disk cache is lost — the next burst pays cold-start I/O again.
  • Cache-induced staleness via stale QRC under failed invalidation. If a write doesn't invalidate the QRC entry (bug in the substrate), users see stale data without warning.

Sibling cache hierarchies on the wiki

Domain Two-tier shape
Web (frontend) concepts/stale-while-revalidate-cache — service worker hot result cache + IndexedDB disk cache.
OS / IO Linux page cache + on-disk file system.
LLM inference concepts/kv-cache (per-request) + concepts/implicit-prompt-caching (cross-request prefix).
MySQL / Postgres Buffer pool + on-disk pages.

The shared principle: a small fast tier in front of a larger slow tier, with the small tier sized to the working set and the large tier optimised for capacity.

What this is NOT

  • Not the same as a CDN cache. CDN caches sit at network edge; DBSQL caches sit at warehouse compute. Different optimisation goals (geo-locality vs warehouse-locality).
  • Not magic for ad-hoc queries. Workloads where every query is novel see no QRC benefit and limited disk-cache benefit.
  • Not user-controllable per-query (in this disclosure). The source treats caching as an automatic substrate property; per- query cache directives are not discussed.

Seen in

  • sources/2026-05-27-databricks-bi-serving-pointers-maximizing-for-performance-and-tcofirst wiki canonicalisation of the two-tier DBSQL caching shape. Names disk cache + QRC verbatim and ties them to "millisecond-latency responses at near-zero compute cost" for dashboard-pattern workloads. Reserved for future ingests: QRC invalidation algorithm, disk-cache eviction policy under warehouse scaling, cache-warming primitives, per-query cache control surfaces.
Last updated · 542 distilled / 1,571 read