Skip to content

CONCEPT Cited by 1 source

Slowly-Changing Dimension (SCD)

Slowly-Changing Dimension (SCD) is the dimensional-modeling pattern for dimension tables whose attributes change over time — customer address, product category, employee department — at a rate much lower than the fact tables that reference them, but not zero. SCD policies define how those changes are recorded: overwrite, add a new row, or track old and new side by side.

The classic SCD types

  • SCD Type 1 — Overwrite. Mutate the existing row in place; no history kept. Cheapest, but destroys audit trail.
  • SCD Type 2 — Row versioning. Close the current row (set valid_to, is_current = false) and insert a new row for the new value (with valid_from = change timestamp, is_current = true). Preserves full history; joins against facts need an as-of-timestamp predicate.
  • SCD Type 3 — Limited history. Keep previous_value columns alongside current_value. Fixed-width history; simple but lossy.

Why SCD appears in this wiki

SCD is one of the canonical incremental-update workloads that motivated row-level semantics on open table formats (concepts/open-table-format). The natural SQL expression of an SCD Type 1 or Type 2 update is a MERGE INTO statement:

MERGE INTO customer_dim t
USING staging_customer_updates s
ON t.customer_id = s.customer_id
WHEN MATCHED AND t.address <> s.address THEN
  UPDATE SET address = s.address, updated_at = s.event_ts
WHEN NOT MATCHED THEN
  INSERT *;

On systems/apache-iceberg and its peers, this is a targeted row-level update, not a partition rewrite. Executing it as INSERT OVERWRITE on the dimension table would rewrite the whole partition on every small batch of updates — prohibitive on anything larger than toy scale. See patterns/merge-into-over-insert-overwrite for the operational prescription. (Source: sources/2025-09-30-expedia-prefer-merge-into-over-insert-overwrite)

Relationship to CDC

CDC is the ingest shape — a stream of insert/update/delete deltas from a source database. SCD is the merge policy — how those deltas are resolved into the warehouse dimension. The two compose: a CDC stream of customer updates lands in a staging table; a MERGE statement applies the staging deltas to the customer dimension under an SCD Type 1 or Type 2 policy; the resulting dimension is a row-level-updated Iceberg table best stored under MOR with periodic copy-on-write compaction.

Why it's a "slowly" changing dimension

The name is warehouse-era — dimensions change slowly relative to facts. Operationally this doesn't relax the correctness requirements; it just means the workload shape is many small batches of row-level updates, not few big partition rewrites, which is precisely why MERGE INTO (not INSERT OVERWRITE) is the right SQL surface on Iceberg. (Source: sources/2025-09-30-expedia-prefer-merge-into-over-insert-overwrite)

Seen in

Last updated · 200 distilled / 1,178 read