CONCEPT Cited by 1 source

Merge-on-read (MOR)¶

Merge-on-Read (MOR) is the row-level update strategy in open table formats where mutations are persisted as separate delta files (append-only insert / update / delete records) that sit alongside the immutable base files, and the merge happens at query time — the reader combines base + delta to reconstruct current state. The paired strategy is copy-on-write (COW), which rewrites whole data files on every mutation.

MOR is available, either as the default or as an opt-in, on all three canonical open table formats: systems/apache-iceberg, systems/apache-hudi (explicit MERGE_ON_READ table type, as distinct from COPY_ON_WRITE), and systems/delta-lake.
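
As a rough illustration of how each format exposes the choice, the sketch below collects the settings commonly used to opt in to MOR behavior. The `mor_settings` dictionary and the `as_tblproperties` helper are names invented for this example, and property names and defaults should be checked against the engine version in use.

```python
# Settings commonly used to opt in to merge-on-read behavior.
# The grouping and the names `mor_settings` / `as_tblproperties` are
# illustrative assumptions, not part of any format's API.
mor_settings = {
    "iceberg": {
        # Row-level operation modes: "copy-on-write" or "merge-on-read".
        "write.delete.mode": "merge-on-read",
        "write.update.mode": "merge-on-read",
        "write.merge.mode": "merge-on-read",
    },
    "hudi": {
        # Spark datasource write option selecting the table type.
        "hoodie.datasource.write.table.type": "MERGE_ON_READ",
    },
    "delta": {
        # Deletion vectors give Delta Lake merge-on-read style deletes.
        "delta.enableDeletionVectors": "true",
    },
}


def as_tblproperties(props: dict[str, str]) -> str:
    """Render properties as the body of a SQL TBLPROPERTIES (...) clause."""
    return ", ".join(f"'{k}' = '{v}'" for k, v in props.items())


print(as_tblproperties(mor_settings["iceberg"]))
```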

The trade-off in one sentence¶

MOR accepts slower reads in exchange for faster writes, then buys back read performance with periodic compaction that collapses accumulated deltas into read-optimized base files, which is just a copy-on-write merge run in the background rather than on the write path.

Why MOR exists¶

COW gives readers the simplest possible job — open a single set of immutable files, scan them — but pays for that read simplicity on the write path: every mutation rewrites the base files containing the affected rows, with full I/O amplification.

That write amplification is prohibitive for:

Change Data Capture (CDC) ingests — a continuous stream of small upserts whose data volume is dwarfed by the file-rewrite cost COW would impose. (Source: sources/2025-09-30-expedia-prefer-merge-into-over-insert-overwrite)
Slowly-changing dimensions (SCD) — classic dimensional-modeling workloads where a dimension table gets continuous row-level edits.
Targeted MERGE INTO workloads where the affected row count is a tiny fraction of the table.

MOR sidesteps the write amplification by writing only the delta and deferring the merge.
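
To make the write-amplification gap concrete, here is a back-of-the-envelope sketch. The workload numbers (1,000 upserts of about 200 bytes, scattered over 50 base files of 128 MiB) are hypothetical, and the model ignores compression, metadata, and commit overhead.

```python
def cow_bytes_written(touched_files: int, file_size_bytes: int) -> int:
    """COW rewrites every base file that contains at least one mutated row."""
    return touched_files * file_size_bytes


def mor_bytes_written(mutated_rows: int, row_size_bytes: int) -> int:
    """MOR writes only the delta records for the mutated rows."""
    return mutated_rows * row_size_bytes


# Hypothetical CDC micro-batch; every number here is an assumption.
FILE_SIZE = 128 * 1024 * 1024  # 128 MiB per base file
cow = cow_bytes_written(touched_files=50, file_size_bytes=FILE_SIZE)
mor = mor_bytes_written(mutated_rows=1_000, row_size_bytes=200)

print(f"COW writes {cow:,} bytes, MOR writes {mor:,} bytes")
print(f"write amplification ratio: about {cow / mor:,.0f}x")
```

The ratio depends almost entirely on how widely the mutated rows are scattered: the same 1,000 upserts landing in a single base file would cut the COW cost by a factor of 50.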

Mechanics¶

Base files — immutable columnar files (typically Parquet) representing table state as of the last compaction.
Delta files — smaller, append-only files recording row-level mutations since that snapshot. Two common shapes: position deletes (delete-by-row-position in a named base file) and equality deletes (delete-by-predicate over column values); inserts are just new data files.
Readers merge base + delta at scan time, applying deletes and overlaying updates per the merge key.
Compaction periodically rewrites base+delta into a new, smaller set of base files with zero deltas — this is exactly copy-on-write merge running in the background. (Source: sources/2025-09-30-expedia-prefer-merge-into-over-insert-overwrite)
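
The mechanics above can be sketched as a small in-memory model. This is a minimal illustration under stated assumptions, not any format's actual reader: `BaseFile`, `Deltas`, `read_merged`, and `compact` are hypothetical names, rows are plain dicts, and the sequence-number ordering that real formats use to decide which deletes apply to which files is simplified away (deletes apply to base rows only, and an update is modeled as an equality delete plus an insert).

```python
from dataclasses import dataclass, field


@dataclass
class BaseFile:
    path: str
    rows: list[dict]  # a row's position is its index in this list


@dataclass
class Deltas:
    # (base file path, row position) pairs to drop
    position_deletes: set[tuple[str, int]] = field(default_factory=set)
    # each predicate maps column -> value; a row matching all pairs is dropped
    equality_deletes: list[dict] = field(default_factory=list)
    # newly written rows (inserts, or the new image of an updated row)
    inserts: list[dict] = field(default_factory=list)


def _matches(row: dict, predicate: dict) -> bool:
    return all(row.get(col) == val for col, val in predicate.items())


def read_merged(base_files: list[BaseFile], deltas: Deltas) -> list[dict]:
    """Reconstruct current table state by merging base + delta at scan time."""
    out = []
    for f in base_files:
        for pos, row in enumerate(f.rows):
            if (f.path, pos) in deltas.position_deletes:
                continue  # removed by a position delete
            if any(_matches(row, p) for p in deltas.equality_deletes):
                continue  # removed by an equality delete
            out.append(row)
    out.extend(deltas.inserts)  # inserts are simply additional data
    return out


def compact(base_files: list[BaseFile], deltas: Deltas,
            new_path: str = "base-compacted.parquet"):
    """Rewrite base + delta into a single new base file with zero deltas."""
    return [BaseFile(new_path, read_merged(base_files, deltas))], Deltas()


base = [BaseFile("base-0.parquet",
                 [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}])]
deltas = Deltas(
    position_deletes={("base-0.parquet", 0)},  # delete id=1 by position
    equality_deletes=[{"id": 2}],              # update id=2: drop old image...
    inserts=[{"id": 2, "v": "b2"}],            # ...and write the new one
)
before = read_merged(base, deltas)
new_base, new_deltas = compact(base, deltas)
print(before == read_merged(new_base, new_deltas))
```

Compaction here is literally the reader's merge with the result written back as a base file, which is why the reader sees the same rows before and after it runs.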

Performance properties¶

Write path (Expedia, 2025-09-30):

Faster writes — append-only delta files, no base-file rewrite.
Reduced I/O — only mutated rows touch disk; object-store operation count drops.
Scales as the table grows: write cost is proportional to delta size, not table size.

Read path:

Baseline read cost rises with delta-file count per touched base file. One or two delta files: negligible overhead. Many: measurable per-scan merge cost.
Predicate pushdown still works on base files; deltas have to be consulted to correct the output. Min/max stats on base files stay honest between compactions.
Compaction cadence is the primary knob: too rare → query regressions as delta count grows; too frequent → write-side cost creeps back toward COW.
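
The cadence trade-off can be illustrated with a toy cost model. Everything here is an assumption made for the sketch: the function name `cost_per_commit`, the linear per-delta merge cost, and all of the default cost constants are hypothetical and unitless, chosen only to show the shape of the curve.

```python
def cost_per_commit(compact_every: int,
                    scans_per_commit: float = 10.0,
                    base_scan_cost: float = 100.0,
                    per_delta_merge_cost: float = 5.0,
                    delta_write_cost: float = 1.0,
                    compaction_cost: float = 400.0) -> float:
    """Average total cost per commit when compacting every `compact_every` commits.

    Assumes each commit adds one delta file, and that every scan pays the base
    scan cost plus a fixed merge cost per outstanding delta file.
    """
    if compact_every < 1:
        raise ValueError("compact_every must be >= 1")
    total = 0.0
    for deltas_outstanding in range(1, compact_every + 1):
        total += delta_write_cost
        total += scans_per_commit * (
            base_scan_cost + per_delta_merge_cost * deltas_outstanding
        )
    total += compaction_cost  # one background rewrite per window
    return total / compact_every


for n in (1, 2, 4, 8, 50):
    print(n, round(cost_per_commit(n), 1))

best = min(range(1, 51), key=cost_per_commit)
print("cheapest cadence under these assumptions:", best)
```

Compacting after every commit pays the rewrite cost constantly, which is COW in all but name, while compacting very rarely lets the per-scan merge cost dominate. The minimum sits between the two and moves with the read-to-write ratio.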

The compaction invariant¶

"Compaction becomes necessary as the number of delta files grows to maintain optimal performance." (Source: sources/2025-09-30-expedia-prefer-merge-into-over-insert-overwrite)

MOR is only a cost win when averaged over the compaction window. A MOR system with broken or disabled compaction degrades into an unbounded delta log, and per-scan merge cost grows without bound along with it. Every production MOR deployment therefore ships (or adopts) a compactor: Iceberg's rewrite_data_files action, Hudi's Compactor service, Delta's OPTIMIZE, or, at exabyte scale, something like Amazon BDT's systems/deltacat Flash Compactor on systems/ray (see concepts/copy-on-write-merge).
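
One simple way to keep the delta log bounded is a threshold trigger on delta files per base file. The sketch below is a hypothetical planner, not the selection logic of any of the compactors named above; `files_needing_compaction` and the `max_deltas_per_base` threshold are invented for illustration.

```python
from collections import Counter


def files_needing_compaction(delta_targets: list[str],
                             max_deltas_per_base: int = 5) -> list[str]:
    """Return base files whose outstanding delta count exceeds the threshold.

    `delta_targets` holds one entry per delta file, naming the base file
    that the delta applies to.
    """
    counts = Counter(delta_targets)
    return sorted(path for path, n in counts.items() if n > max_deltas_per_base)


# Hypothetical state: base-a has accumulated 6 deltas, base-b only 2.
outstanding = ["base-a.parquet"] * 6 + ["base-b.parquet"] * 2
print(files_needing_compaction(outstanding))
```

A per-file trigger like this rewrites only the hot files, so the background cost tracks where the deltas actually accumulate rather than the size of the whole table.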

When to use which¶

Workload	Strategy
Full-partition refresh	`INSERT OVERWRITE`
Large-batch updates touching most rows of many files	COW
CDC / SCD / incremental upsert	MOR + `MERGE INTO`
Targeted `MERGE INTO` with natural key	MOR
Read-heavy table, rare updates	COW
Write-heavy table, frequent updates	MOR (+ aggressive compaction)

Seen in¶

sources/2025-09-30-expedia-prefer-merge-into-over-insert-overwrite: Expedia Group Tech primer positioning MOR as the default for MERGE INTO workloads on systems/apache-iceberg; names the three write-side wins (fast writes / reduced I/O / improved-with-compaction query perf), the three cost axes (lower storage, efficient resource use, scalability), and the load-bearing caveat (compaction mandatory as delta count grows).