CONCEPT Cited by 2 sources
Materialized view¶
Definition¶
A materialized view (MV) is a database object that persists the result of a query as a physical table. Unlike a regular view — which is a stored query definition that is re-evaluated on each read — a materialized view stores the query's output rows, so reads scan the stored result directly instead of re-executing the underlying computation.
CREATE MATERIALIZED VIEW order_report AS
SELECT region, SUM(orders)
FROM customer_and_order_table
GROUP BY region;
Reading order_report scans a small pre-aggregated table; the
underlying GROUP BY over customer_and_order_table is not
re-executed on each read.
The three axes of MV design¶
1. Refresh strategy¶
| Strategy | Mechanism | Trade-off |
|---|---|---|
| Full refresh | Re-execute the MV definition; replace the persisted result. | Simple; expensive at scale; staleness between refreshes. |
| Incremental refresh (IVM) | Compute only the delta induced by input changes. | Continuous freshness; engine complexity; coverage limited by MV shape. |
| On-demand refresh | Refresh only when a read requests it. | Bounds cost to actual demand; introduces read-time latency. |
2. Use case¶
| Use case | Shape | Examples |
|---|---|---|
| Query acceleration | Pre-aggregations powering dashboards. | Star-schema rollups; daily/hourly aggregates; metric-view materializations. |
| ETL replacement | Derived tables in a pipeline that would otherwise need hand-written incremental code. | The SDP / Enzyme thesis: "if MVs can be efficiently and incrementally maintained, it will significantly simplify ETL workloads which otherwise require writing complex custom code." (Source: sources/2026-05-29-databricks-databricks-at-sigmod-2026) |
| Semantic-layer materialisation | The semantic-layer / headless-BI metric definition is materialised behind the same query surface. | concepts/metric-view-materialization / systems/databricks-metric-views. |
3. Authoring language¶
| Language | Industrial coverage |
|---|---|
| SQL | Universal across MV-supporting systems. |
| Python | Rare in industrial IVM; explicitly claimed by Enzyme ("While most industry solutions just focus on SQL, Enzyme supports MVs specified in Python as well"). See concepts/multi-language-materialized-view. |
| DSL (e.g. metric-view DSL) | Newer surface; see concepts/metric-view-materialization. |
Why MVs are increasingly central in lakehouse architectures¶
Three trends elevate MVs from a query-acceleration sidecar to a first-class architectural primitive:
-
IVM coverage has caught up to production MV shapes. Joins, windows, aggregations, and combinations are all incrementally maintainable in modern engines (the Enzyme claim). This unlocks ETL-replacement use cases beyond simple aggregations.
-
Declarative pipelines need a unit of declaration. SDP's
@dp.materialized_viewdecorator is a Python-level analogue of the SQLCREATE MATERIALIZED VIEWstatement; the MV is the unit of declarative authorship. -
Semantic layers materialise their definitions. The headless-BI semantic layer hosts metric definitions; the metric-view materialisation capability lets the platform automatically maintain pre-aggregated results behind the metric surface.
What MVs replace¶
| Replaced artifact | What MVs supply instead |
|---|---|
| Hand-written aggregate tables | Declarative pre-aggregation. |
| Hand-written incremental ETL (merge / upsert / backfill code) | The IVM engine handles incremental maintenance. |
| BI-tool-side aggregate-table swapping | Transparent query rewriting onto materialised metric views. |
| Refresh pipelines (Airflow DAGs that rebuild aggregates nightly) | Continuous incremental refresh from the IVM engine. |
Seen in¶
- sources/2026-05-29-databricks-databricks-at-sigmod-2026 —
introduces Enzyme as the IVM engine behind
SDP's
@dp.materialized_view, names the four-claim coverage axis (joins + windows + aggregations + combinations; non-deterministic functions; Python + SQL; cost-model strategy), and articulates the ETL-replacement thesis. - sources/2026-05-27-databricks-bi-serving-pointers-maximizing-for-performance-and-tco — covers the metric-view-materialization pattern as MVs hosted behind a headless-BI semantic layer surface.
Related¶
- concepts/incremental-view-maintenance — the technique that makes MVs an ETL primitive.
- concepts/metric-view-materialization — MVs as semantic-layer substrate.
- concepts/non-deterministic-mv-maintenance — IVM under
current_date()and AI functions. - concepts/multi-language-materialized-view — Python + SQL MVs.
- systems/enzyme-ivm — Databricks' IVM engine.
- systems/lakeflow-spark-declarative-pipelines — the user-facing declarative-pipeline surface where MVs are authored.
- systems/databricks-metric-views — semantic-layer metric views with materialisation support.
- patterns/auto-materialized-aggregation-via-semantic-layer — pattern for hands-off MV maintenance behind a metric surface.
- patterns/query-rewrite-to-pre-aggregated-materialization — transparent routing of queries onto a materialised result.