Skip to content

SYSTEM Cited by 1 source

Apache Hudi

Apache Hudi (Hadoop Upserts Deletes and Incrementals) is one of the three canonical open table formats alongside systems/apache-iceberg and systems/delta-lake — all layered over systems/apache-parquet on object storage to provide transactional row-level inserts, updates, deletes, and schema evolution on top of immutable objects.

Minimum viable framing for this wiki: same structural role as Iceberg — see concepts/open-table-format for the shared shape. Hudi's distinguishing emphasis is near-real-time upsert ingestion with two on-disk storage modes: Copy-on-Write (concepts/copy-on-write-merge) and Merge-on-Read (read-time merge of a base file + a log of delta records).

Seen in

Last updated · 200 distilled / 1,178 read