CONCEPT Cited by 1 source
Merge-on-read (MOR)¶
Merge-on-Read (MOR) is the row-level update strategy in open table formats where mutations are persisted as separate delta files (append-only insert / update / delete records) that sit alongside the immutable base files, and the merge happens at query time — the reader combines base + delta to reconstruct current state. The paired strategy is copy-on-write (COW), which rewrites whole data files on every mutation.
MOR is the default-or-offered strategy on all three canonical open
table formats: systems/apache-iceberg, systems/apache-hudi
(explicit MERGE_ON_READ table type, as distinct from
COPY_ON_WRITE), and systems/delta-lake.
The trade-off in one sentence¶
MOR trades fast writes for slower reads, then buys back the read performance with periodic compaction that collapses accumulated deltas into read-optimized base files — which is just copy-on-write merge run in the background rather than on the write path.
Why MOR exists¶
COW gives readers the simplest possible job — open a single set of immutable files, scan them — but pays for that read simplicity on the write path: every mutation rewrites the base files containing the affected rows, with full I/O amplification.
That write amplification is prohibitive for:
- Change Data Capture (CDC) ingests — a continuous stream of small upserts whose data volume is dwarfed by the file-rewrite cost COW would impose. (Source: sources/2025-09-30-expedia-prefer-merge-into-over-insert-overwrite)
- Slowly-changing dimensions (SCD) — classic dimensional-modeling workloads where a dimension table gets continuous row-level edits.
- Targeted
MERGE INTOworkloads where the affected row count is a tiny fraction of the table.
MOR sidesteps the write amplification by writing only the delta and deferring the merge.
Mechanics¶
- Base files — immutable columnar files (typically Parquet) representing table state as of the last compaction.
- Delta files — smaller, append-only files recording row-level mutations since that snapshot. Two common shapes: position deletes (delete-by-row-position in a named base file) and equality deletes (delete-by-predicate over column values); inserts are just new data files.
- Readers merge base + delta at scan time, applying deletes and overlaying updates per the merge key.
- Compaction periodically rewrites base+delta into a new, smaller set of base files with zero deltas — this is exactly copy-on-write merge running in the background. (Source: sources/2025-09-30-expedia-prefer-merge-into-over-insert-overwrite)
Performance properties¶
Write path (Expedia, 2025-09-30):
- Faster writes — append-only delta files, no base-file rewrite.
- Reduced I/O — only mutated rows touch disk; object-store operation count drops.
- Scales with data volume — write cost is proportional to delta size, not table size.
Read path:
- Baseline read cost rises with delta-file count per touched base file. One or two delta files: negligible overhead. Many: measurable per-scan merge cost.
- Predicate pushdown still works on base files; deltas have to be consulted to correct the output. Min/max stats on base files stay honest between compactions.
- Compaction cadence is the primary knob: too rare → query regressions as delta count grows; too frequent → write-side cost creeps back toward COW.
The compaction invariant¶
"Compaction becomes necessary as the number of delta files grows to maintain optimal performance." (Source: sources/2025-09-30-expedia-prefer-merge-into-over-insert-overwrite)
MOR is only a cost win averaged over the compaction window. A
MOR system with broken or disabled compaction degrades into an
unbounded delta log and loses all read-side properties. Every
production MOR deployment therefore ships (or adopts) a compactor —
Iceberg's rewrite_data_files action, Hudi's Compactor service,
Delta's OPTIMIZE, or — at exabyte scale — something like Amazon
BDT's systems/deltacat Flash Compactor on systems/ray
(see concepts/copy-on-write-merge).
When to use which¶
| Workload | Strategy |
|---|---|
| Full-partition refresh | INSERT OVERWRITE |
| Large-batch updates touching most rows of many files | COW |
| CDC / SCD / incremental upsert | MOR + MERGE INTO |
Targeted MERGE INTO with natural key |
MOR |
| Read-heavy table, rare updates | COW |
| Write-heavy table, frequent updates | MOR (+ aggressive compaction) |
(See also patterns/merge-into-over-insert-overwrite.)
Seen in¶
- sources/2025-09-30-expedia-prefer-merge-into-over-insert-overwrite
— Expedia Group Tech primer positioning MOR as the default for
MERGE INTOworkloads on systems/apache-iceberg; names the three write-side wins (fast writes / reduced I/O / improved-with- compaction query perf), the three cost axes (lower storage, efficient resource use, scalability), and the load-bearing caveat (compaction mandatory as delta count grows).