PATTERN Cited by 1 source

Incremental clustering on write¶

Problem¶

Periodic full-table re-clustering produces unbounded write amplification at scale. The canonical instance is Z-Ordering: each OPTIMIZE ZORDER BY rewrites entire files (or whole partitions' worth of files), including data that was already correctly clustered before the new ingest. As the table grows, new data becomes a shrinking fraction of it, yet the rewrite cost of every maintenance run still grows linearly with table size.

The 2026-06-01 Databricks "Debunking 8 data layout myths" post states the problem directly:

"Z-Order has to be rerun periodically as new data lands, and each rerun rewrites large amounts of old, possibly already-clustered data to restore clustering quality. With continuous ingestion, the cost of keeping data well-clustered with Z-Order grows along with the table."

The structural property of the periodic-rewrite approach: rewrite cost is decoupled from new-data volume and coupled to table size. The failure mode is predictable: at sufficient scale, one rewrite cycle takes longer than the inter-write interval, maintenance never catches up with ingest, and clustering quality degrades indefinitely.
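The failure condition can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a constant rewrite throughput and a fixed interval between maintenance runs. Both figures, and the function names, are hypothetical and do not come from the source.

```python
def rewrite_hours(table_gb: float, throughput_gb_per_hour: float) -> float:
    """Hours needed to rewrite the whole table once (periodic full re-cluster)."""
    return table_gb / throughput_gb_per_hour


def breakeven_table_gb(interval_hours: float, throughput_gb_per_hour: float) -> float:
    """Largest table a full rewrite can finish within one maintenance interval."""
    return interval_hours * throughput_gb_per_hour


# Hypothetical numbers, chosen only for illustration.
THROUGHPUT = 500.0  # GB rewritten per hour by the maintenance job
INTERVAL = 24.0     # hours between maintenance runs

limit = breakeven_table_gb(INTERVAL, THROUGHPUT)
print(limit)                                         # 12000.0
print(rewrite_hours(50_000, THROUGHPUT) > INTERVAL)  # True
```

Under these assumed figures, any table larger than about 12 TB can no longer be fully re-clustered within one interval, regardless of how little new data arrived.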

Solution¶

Maintain clustering layout incrementally on the write path. New writes are placed into files that preserve locality on the clustering keys; periodic compaction operates on small, recent file sets rather than the whole table; the rewrite cost stays proportional to new-data volume, not table size.

The 2026-06-01 source's framing:

"Liquid clusters incrementally, including at write time, so the layout stays optimal without unnecessary rewrites."

Structural pieces¶

Piece	What it does
Write-path layout	Newly written files honour the table's clustering keys at the moment of write, using locality-aware file placement and intra-file sort.
Incremental compaction	Background OPTIMIZE operates on recent or unbalanced subsets of files rather than the whole table.
Layout-state tracking	The transaction log records which files are well clustered and which need rebalancing; the planner and OPTIMIZE both consume this state.
No periodic full-rewrite	The layout doesn't require periodic table-wide rebuilding to maintain quality.
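How the pieces fit together can be sketched with a toy in-memory model. This is not Liquid Clustering's algorithm, which the source does not describe; `ToyTable`, `DataFile`, the `clustered` flag, and the "short file needs rebalancing" rule are all hypothetical stand-ins for write-path layout, layout-state tracking, and incremental compaction.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    rows: list       # rows sorted on the clustering keys
    clustered: bool  # layout state, as a log entry might record it


@dataclass
class ToyTable:
    keys: tuple                                # clustering key column names
    files: list = field(default_factory=list)
    rows_rewritten: int = 0                    # maintenance write amplification

    def _sort_key(self, row: dict):
        return tuple(row[k] for k in self.keys)

    def write(self, batch: list, max_rows_per_file: int = 2) -> None:
        """Write path: sort the batch on the clustering keys, then split into files."""
        ordered = sorted(batch, key=self._sort_key)
        for i in range(0, len(ordered), max_rows_per_file):
            chunk = ordered[i:i + max_rows_per_file]
            # Assumed rule: a short trailing file is flagged as needing rebalancing.
            self.files.append(DataFile(chunk, clustered=len(chunk) == max_rows_per_file))

    def optimize(self) -> None:
        """Incremental compaction: rewrite only the files flagged as not clustered."""
        pending = [f for f in self.files if not f.clustered]
        if not pending:
            return
        merged = sorted((r for f in pending for r in f.rows), key=self._sort_key)
        self.rows_rewritten += len(merged)
        self.files = [f for f in self.files if f.clustered]
        self.files.append(DataFile(merged, clustered=True))


table = ToyTable(keys=("date", "customer_id"))
table.write([{"date": "2026-01-01", "customer_id": c} for c in (3, 1, 2)])
table.write([{"date": "2026-01-02", "customer_id": c} for c in (9, 7, 8)])
table.optimize()
print(table.rows_rewritten)  # 2
```

Six rows were written but only the two rows in flagged files were rewritten; files already marked as clustered were never touched by `optimize`.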

In practice¶

Day 1: Write 1 GB. New files clustered on (date, customer_id).
Day 2: Write 1 GB. New files clustered on (date, customer_id).
       Background OPTIMIZE merges fragmented files within the
       recent batch. Cost: small, proportional to new data.
Day 365: Table is 365 GB.
       Background OPTIMIZE still operates on recent batches and
       fragmentation hotspots. Cost: small, proportional to
       new data per cycle.

Compare with periodic full-rewrite:
Day 365: Background OPTIMIZE ZORDER must re-cluster all 365 GB
         to reincorporate Day 365's writes into the global
         Z-order. Cost: O(table size).
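The walkthrough above can be turned into a small simulation. It is a simplified model under stated assumptions: maintenance runs once per day, a full rewrite touches every byte in the table, and incremental compaction rewrites each day's batch exactly once. The `simulate` function is illustrative only.

```python
def simulate(days: int, daily_gb: float, full_rewrite: bool) -> float:
    """Return total GB rewritten by maintenance over `days` of steady ingest."""
    table_gb = 0.0
    rewritten = 0.0
    for _ in range(days):
        table_gb += daily_gb
        # A full rewrite touches the whole table; incremental touches the new batch.
        rewritten += table_gb if full_rewrite else daily_gb
    return rewritten


periodic = simulate(365, 1.0, full_rewrite=True)
incremental = simulate(365, 1.0, full_rewrite=False)
print(periodic, incremental)  # 66795.0 365.0
```

With 1 GB of ingest per day for a year, the periodic model rewrites 66,795 GB in total against 365 GB for the incremental model.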

Cost-decoupling property¶

The pattern's load-bearing economic property: maintenance cost per unit time is approximately constant as the table grows.

Periodic-rewrite (Z-Order) cost over time:
  cost(n) = O(table_size(n)) per cycle
         ≈ O(n) for steady ingest
         → quadratic total cost over table lifetime

Incremental-on-write (Liquid) cost over time:
  cost(n) = O(new_data(n)) per cycle
         ≈ O(1) for steady ingest
         → linear total cost over table lifetime
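The totals follow from summing the per-cycle cost. As a sketch, assume a constant ingest of d per cycle and a cost c per unit of data rewritten (both symbols are introduced here for illustration), so that the table holds k d after cycle k:

```latex
\begin{align*}
C_{\text{periodic}}(n) &= \sum_{k=1}^{n} c\,(k\,d) = c\,d\,\frac{n(n+1)}{2} = \Theta(n^2) \\
C_{\text{incremental}}(n) &= \sum_{k=1}^{n} c\,d = c\,d\,n = \Theta(n) \\
\frac{C_{\text{periodic}}(n)}{C_{\text{incremental}}(n)} &= \frac{n+1}{2}
\end{align*}
```

For n = 365 daily cycles the ratio is 183, and it keeps growing with the age of the table.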

The difference matters at PB scale: a 1 PB table maintained via periodic rewrite pays maintenance cost roughly proportional to 1 PB on every cycle. The same table with incremental clustering pays maintenance cost roughly proportional to the daily / hourly new-data volume — orders of magnitude less.

Sibling failure modes the pattern avoids¶

Failure mode	Periodic rewrite	Incremental on-write
Maintenance run takes longer than the inter-write interval	Layout quality degrades indefinitely	Maintenance keeps up with ingest
Storage cost spike during rewrite	2× table size during rewrite (old + new files coexist)	Bounded by new-write volume
OPTIMIZE contends with concurrent workloads	Long-running OPTIMIZE has wide windows of incidental impact	Short, frequent operations
Fragmentation between maintenance cycles	Fragmentation accumulates until next cycle	Continuously bounded

Composition with managed-table substrate¶

The pattern composes with automatic table optimization: the substrate (Predictive Optimization) decides when to run incremental compaction based on observed write patterns and clustering-state telemetry. The user declares clustering keys (patterns/clustering-keys-as-engine-input); the substrate owns the maintenance schedule.

When this doesn't apply¶

Append-only logs with no clustering goals — the clustering layer is unnecessary; raw append plus periodic compaction is sufficient.
Tables with infrequent writes — periodic rewrite cost is tolerable when writes are rare (monthly batch loads).
Legacy tools that require periodic table-wide rebuilds — some downstream consumers may snapshot tables on a periodic cadence and re-read everything; incremental layout doesn't help if the consumer reads the whole table anyway.

Sibling patterns on the wiki¶

Lazy compaction (LSM tier-merge) — same principle at the LSM-tree storage layer; merging happens in tiers as data ages, not as periodic full-table rewrites.
Multi-strategy compaction (patterns/multi-strategy-compaction) — Magic Pocket's L1 / L2 / L3 sequence; each tier handles a different cost-vs-effectiveness trade-off in compaction.
Background reconciler for read-path optimization (patterns/background-reconciler-for-read-path-optimization) — sibling shape on streaming brokers: write-path produces unoptimised files, reconciler produces read-optimised files in the background.

The shared principle: bound maintenance cost to incremental work, not table state.

Seen in¶

sources/2026-06-01-databricks-debunking-8-data-layout-myths-why-liquid-clustering-outperfo — First wiki canonicalisation as a named pattern. The Z-Ordering critique ("unnecessary rewrites... the cost of keeping data well-clustered with Z-Order grows along with the table") and the Liquid Clustering contrast ("Liquid clusters incrementally, including at write time, so the layout stays optimal without unnecessary rewrites") make the pattern load-bearing for the post's economic case at PB scale. Reserved for future ingests: the precise telemetry that triggers incremental compaction, the algorithmic difference between Liquid's incremental layout and competing approaches, and the worst-case behaviour under high-velocity write bursts.

systems/liquid-clustering — canonical instance.
systems/delta-lake — table format substrate.
systems/databricks-predictive-optimization — runs the incremental compaction work.
concepts/z-ordering — the periodic-rewrite predecessor this pattern supersedes.
concepts/write-amplification — the cost dimension this pattern bounds.
concepts/automatic-table-optimization — the substrate-side property that decides incremental compaction scheduling.
concepts/over-partitioning — sibling failure mode that emerges when teams pre-commit to fixed layouts.
patterns/clustering-keys-as-engine-input — the broader abstraction this maintenance discipline sits beneath.