SYSTEM Cited by 1 source
Databricks SQL Warehouses¶
Databricks SQL warehouses (DBSQL warehouses) are the compute layer for BI / analytical SQL queries on the Databricks Lakehouse. They are the serving tier in the four-layer BI serving stack (physical → semantic → materialization → warehouse / cache). Two architectural properties are load-bearing for the BI workload shape: serverless auto-scaling (absorbs concurrency bursts; pay-per-use rather than peak-capacity) and a two-tier cache hierarchy — the disk cache (warehouse-local hot data) and the query result cache (QRC) (repeated queries served without re-execution). For dashboards with consistent query patterns, these caching tiers "turn many requests into millisecond-latency responses at near-zero compute cost". (Source: sources/2026-05-27-databricks-bi-serving-pointers-maximizing-for-performance-and-tco)
What it is¶
A SQL warehouse is the named runtime endpoint a BI tool, SQL notebook, AI/BI Dashboard, or Genie session connects to in order to execute SQL against the Lakehouse. Three properties make it the right serving tier for BI workloads specifically:
- High concurrency: BI workloads have "highly concurrent" query patterns; the warehouse provisions enough Photon-backed compute to serve dashboard refreshes from many concurrent users.
- Latency-sensitive: BI users are interactively waiting; the warehouse minimises tail latency via the cache hierarchy + Photon's vectorised query engine.
- Repetitive query patterns: BI dashboards re-run the same queries on schedule and on user interaction — the substrate that makes QRC architecturally load-bearing rather than incidental.
The two-tier cache hierarchy¶
The source explicitly names the two tiers and their roles:
"Lean on DBSQL's caching tiers. Disk cache keeps hot data local to the warehouse, and query result cache (QRC) serves repeated queries without re-execution. For dashboards with consistent query patterns, caching turns many requests into millisecond-latency responses at near-zero compute cost."
| Tier | What it caches | Hit cost | Miss path |
|---|---|---|---|
| Query Result Cache (QRC) | Full query result, keyed on query text + input table version. | Near-zero compute (return cached result). | Falls through to disk cache + re-execution. |
| Disk cache | Hot Parquet files / data blocks, materialised on warehouse-local SSD. | Local I/O instead of cloud-storage round-trip. | Falls through to cloud storage + Predictive-Optimization-managed Delta data skipping. |
The compounding shape: a repeating dashboard query hits QRC on the second-and-subsequent invocations (so 0 compute), and even the first invocation hits the disk cache for files served recently (so reduced cloud-storage I/O). For BI workloads, this collapses most of the substrate's working set into local SSD + in-memory cached results.
Generalised at concepts/dbsql-caching-tiers.
Auto-scaling: pay for what you use¶
The first TCO lever named in the source: "Right-size your SQL warehouse. Use serverless SQL warehouses with auto-scaling for BI concurrency bursts. You pay for what you use, not peak capacity."
The architectural shape: serverless SQL warehouses scale up within seconds to absorb a burst of concurrent dashboard users (morning logins, after-meeting refreshes, end-of-quarter reporting) and scale down during quiet periods. The right-sizing guidance is workload-shape-aware — minimum cluster size for steady load, maximum for burst capacity, plus auto-suspend after idle.
This is the same architectural shape as scale-to-zero applied to the analytical serving tier: the warehouse's compute footprint follows the concurrency curve, not the peak demand curve.
Composition with the rest of the BI serving stack¶
BI consumers (dashboards / Genie / BI tools)
│
▼ SQL connection
┌─────────────────────────────────────┐
│ DBSQL Warehouse ◄─── this page │
│ + Query Result Cache │
│ + Disk cache │
│ + Photon vectorised exec │
│ + Auto-scaling │
└─────────────────┬───────────────────┘
│ resolves MEASURE() calls
▼
┌────── Metric Views ──────┐
│ (semantic layer in UC) │
└─────────────┬────────────┘
│ query rewriting
▼
Materialization → UC managed tables
with Liquid Clustering + Predictive Optimization
The warehouse is the layer where caching turns the substrate investment into millisecond responses for the consumer. The caching tiers and the auto-scaling work in opposite directions: caching reduces the compute needed per query; auto-scaling provisions enough compute for the queries that miss cache.
Reflexive monitoring via system tables¶
The fourth TCO lever: "Monitor with system tables. System tables
such as system.billing.usage and system.query.history can be
used to track BI usage by dashboard, user, and warehouse. Build
Metric Views and an AI/BI Dashboard on system tables to gain
visibility into your BI usage."
Two tables called out by name:
system.billing.usage— per-warehouse / per-user / per-job cost attribution. The substrate for cost dashboards segmented by dashboard owner, BI tool, or business unit.system.query.history— per-query history with execution metadata. The substrate for slow-query investigation, repeat- query identification (QRC candidate detection), and per-user query-volume attribution.
The reflexive shape: build a Metric View over the system tables
and surface it through Genie / dashboards so warehouse
observability sits inside the same primitive the warehouse
itself serves. This is one of the source's worked examples — a
metv_dbsql_metrics Metric View over system.billing.usage and
system.query.history queryable from both AI/BI Dashboards and
Genie.
DirectQuery vs extracts¶
The third TCO lever from the source: "Eliminate redundant data movement. Serve BI directly from the lakehouse via DirectQuery or live connections, rather than using extracts or imports."
The architectural antipattern: BI tools that pull a copy of the data into their own engine (Tableau extracts, Power BI imports) introduce a staleness window, duplicate the storage, and break lineage at the BI-tool boundary. The recommended shape is to keep queries against the warehouse with DirectQuery / live connections so:
- Lineage stays coherent end-to-end in Unity Catalog.
- Governance (row filters, column masks, audit) applies at the warehouse, not at the BI tool.
- Caching lives in the warehouse where it benefits all consumers, not per-BI-tool.
TCO geometry (verbatim from source)¶
"Faster queries and lower cost aren't competing goals — every optimization that reduces data scanned also reduces the compute you pay for. And each optimization in the stack compounds. Liquid clustering and better statistics improve data skipping and query plans. Materializations can be refreshed incrementally, reducing the compute SQL warehouses need to serve dashboards."
Three named TCO levers compose:
- Less data scanned — physical-layer optimisations (managed tables + liquid clustering + Predictive Optimization data skipping).
- Less compute per query — semantic-layer materialization + query rewriting routes to pre-aggregated form.
- Less compute total — caching tiers (QRC + disk cache) + auto-scaling pay-per-use.
The compounding shape is the article's central thesis: each layer multiplies the savings of the layer below.
Operational considerations¶
- Warehouse sizing is workload-shape-dependent. No specific per-tenant or per-workload guidance disclosed in the source beyond "right-size"; the principles are auto-scaling for burst, auto-suspend for idle.
- QRC freshness contract. The source does not detail QRC invalidation semantics under high write churn — a query that caches a result, followed by a write to the underlying table, followed by a repeat query — but the implicit contract is that the cache key includes a table-version component so writes invalidate cache entries.
- Disk-cache locality under warehouse scaling: when a warehouse scales out, new compute nodes start with empty disk caches and pay the cloud-storage I/O cost for the first request — a transient cold-start cost during burst.
- No multi-tenant isolation depth. The source treats the warehouse as a single named resource; multi-tenant deployments with per-tenant warehouses, per-team warehouses, or workload-class-segregated warehouses are not discussed in this post.
Seen in¶
- sources/2026-05-27-databricks-bi-serving-pointers-maximizing-for-performance-and-tco
— first wiki disclosure of DBSQL warehouses as a distinct
named system in the BI serving context (previously implicit
in the Databricks + UC frame). Names
the two-tier cache hierarchy (disk cache + QRC) verbatim;
the auto-scaling + pay-per-use shape; the system-table-driven
observability surface (
system.billing.usage+system.query.history); the DirectQuery vs extracts antipattern resolution. Reserved for future ingests: warehouse-sizing envelope (concurrency / row-volume / latency contract per t-shirt size), QRC invalidation algorithm, disk-cache eviction policy, cold-start cost during scale-out, multi-warehouse isolation patterns.
Related¶
- systems/databricks — the platform.
- systems/photon — the vectorised query engine inside the warehouse.
- systems/databricks-metric-views — what the warehouse
resolves
MEASURE()calls against. - systems/databricks-predictive-optimization — the physical-layer optimisation substrate that compounds with the warehouse caching tiers.
- systems/uc-managed-tables — the table substrate the warehouse queries.
- systems/databricks-genie — one of the consumer surfaces.
- systems/unity-catalog — the governance plane.
- concepts/dbsql-caching-tiers — the generalised two-tier-cache concept.
- concepts/headless-bi-semantic-layer — the semantic-layer shape the warehouse serves.