PATTERN Cited by 1 source

Auto-materialized aggregation via semantic layer¶

Pattern¶

When BI dashboards on a metric become slow, enable materialization on the metric in the semantic layer rather than hand-building aggregate tables + refresh pipelines + BI-tool query updates. The platform automatically maintains pre-aggregated results behind the same metric definition the BI tools already query, with incremental refresh and transparent query rewriting to the pre-aggregated form. The consumer's query never changes.

Forces¶

Aggregate-table sprawl when every BI tool grows its own pre-aggregation tables for the same metrics.
Refresh-pipeline maintenance burden — each aggregate table has its own Airflow/dbt job + monitoring + failure modes.
BI-tool query updates every time the aggregation logic changes — the BI tool has to be repointed at the new aggregate.
Governance drift when the aggregate table has different access controls than the source.
Cost itemisation pain — aggregation pipelines become their own line items on the compute bill.

The source's framing of the antipattern: "A dashboard query takes 30 seconds, so someone builds an aggregate table to speed it up. That table needs a refresh pipeline. The pipeline needs monitoring. Then a second BI tool needs the same data in a slightly different shape, so someone builds another aggregate table using a separate pipeline. Before long, you're managing a sprawl of aggregates, extracts, and tool-specific semantic layers — each with its own staleness window, its own governance gaps, and its own line item on the compute bill."

Solution shape (verbatim from the source)¶

"Metric View materialization offers a simpler alternative. When you enable materialization on a Metric View, the platform automatically maintains pre-aggregated results behind the same metric definitions your BI tools already query — no separate aggregate tables to build, no BI tool queries to refactor."

The pattern collapses three coupled artifacts into one:

Pre-pattern (3 coupled artifacts)	Post-pattern (1 governed primitive)
Aggregate tables (in storage)	Pre-aggregated form lives behind the Metric View
Refresh pipelines (Airflow / dbt jobs)	Incremental refresh runs as a substrate property
BI-tool queries pointed at aggregate tables	BI-tool queries continue to call `MEASURE()`; rewrite handles routing

Concrete steps¶

Pair this pattern with patterns/governed-metric-as-headless-bi-substrate — the pre-condition is that the metric is already defined as a Metric View (or equivalent governed primitive). Without that, there's nothing to materialize behind.
Identify the highest-traffic metrics — the source's "Get started" guidance is "enable materialization for your highest-traffic metrics", not blanket-enable. Use system.query.history (Databricks) or equivalent to find the metric+grain combinations that get queried most.
Enable materialization on the Metric View. Concrete syntax varies by platform; the Databricks shape is a property on the Metric View definition.
Verify transparent routing. Check the query plan for confirmation that the consumer query is hitting the materialization instead of falling back to base-table compute. The source's evidence is a Genie query plan screenshot showing exactly this.
Monitor the refresh latency under high ingest. If the materialization is more stale than the consumer requires, tune the refresh cadence or unmaterialise the metric.

Trade-offs¶

Choice	Implication
Selective materialization (only highest-traffic metrics)	Avoids storage / refresh-cost sprawl; requires telemetry to choose what to materialize.
Blanket materialization	Storage and refresh costs dominate; only justified for narrow workloads where everything is hot.
Incremental refresh	Refresh cost amortised against write activity; freshness contract under high churn must be validated.
Transparent query rewriting	Consumer doesn't need to know about materialization; rewrite optimiser must cover the queries it expects to.

Failure modes¶

Stale materialization at high ingest. If refresh latency exceeds the freshness window the consumer cares about, the materialization either lies (returns stale data) or falls back to base-table compute. Mitigation: validate the freshness contract under realistic write churn.
Query-rewriter coverage gaps. A materialization that should match a query but doesn't (because the rewriter is conservative or the grain doesn't quite line up) silently falls back to base-table compute — producing inconsistent performance the consumer notices but can't explain.
Materialization sprawl — the same antipattern as pre-aggregation tables, just at a different layer. Mitigation: rationalise materializations periodically based on actual hit rate.
Cost-attribution opacity. When the consumer call is the same whether or not a materialization is hit, cost-attribution per metric becomes harder unless the platform exposes it.

Sibling patterns¶

patterns/governed-metric-as-headless-bi-substrate — the pre-condition pattern. Without a governed metric, there's nothing to materialize behind.
patterns/query-rewrite-to-pre-aggregated-materialization — the optimiser-side counterpart pattern that makes transparent routing work.
patterns/cdf-incremental-replacing-full-rescan — the same incremental-refresh principle at a different layer (Bronze→Silver pipeline transitions in Octopus Energy MHHS).
patterns/precomputed-sketch-column-in-delta-table — a related precompute-then-serve pattern (probabilistic data structures stored as Delta-table columns for fast approximate answers).

Where this shows up on the wiki¶

systems/databricks-metric-views — canonical instance.
concepts/metric-view-materialization — the generalised concept.
concepts/headless-bi-semantic-layer — the substrate this pattern materializes behind.

Seen in¶

sources/2026-05-27-databricks-bi-serving-pointers-maximizing-for-performance-and-tco — first wiki canonicalisation as a pattern. Names the three-artifacts-collapse-into-one structural move and the four sub-properties (auto pre-aggregation / incremental refresh / intelligent query rewriting / transparent routing). Reserved for future ingests: refresh-latency contract under high ingest, query-rewriter coverage envelope, materialization storage-vs-speedup tuning, the relationship between materialization granularity and rewriter coverage.

systems/databricks-metric-views — the canonical instance.
systems/uc-managed-tables — the storage substrate materializations sit on.
concepts/metric-view-materialization — the generalised concept.
concepts/headless-bi-semantic-layer — the substrate the pattern operates within.
patterns/governed-metric-as-headless-bi-substrate — the pre-condition pattern.
patterns/query-rewrite-to-pre-aggregated-materialization — the optimiser-side companion.