Apache Parquet¶
Apache Parquet (2013) is a columnar on-disk file format for tabular data. It became the de-facto object-level format for tables on cloud object stores, enabling the "data lake over S3" pattern at scale — and the basis on which richer table formats like systems/apache-iceberg are built.
Why it won¶
- Columnar layout — reads only the columns needed by a query, cutting I/O dramatically for analytical workloads.
- Per-column compression and encoding (dictionary, RLE, delta), exploiting columnar value locality.
- Statistics per row-group — min/max/null-count let readers skip entire row groups that can't match a predicate.
- Language-agnostic — Java, C++, Python, Rust, Go all have mature readers/writers; no proprietary lock-in.
- Good fit for immutable object storage — one Parquet file per object, append-oriented write pattern, no in-place updates required.
(Source: sources/2025-03-14-allthingsdistributed-s3-simplicity-is-table-stakes)
Scale (per the S3-at-19 post, 2025)¶
"S3 stores exabytes of parquet data and serves hundreds of petabytes of Parquet data every day."
This is the rare combination of an open format that has become de-facto infrastructure. It's why Iceberg and Delta Lake both adopted Parquet as their data file layer — piggybacking on a decade-plus of reader/writer maturity and installed base.
Where Parquet stops and table formats begin¶
Parquet answers "how do I store a row-group of rows efficiently in one object?" It does not answer:
- How do I mutate individual rows without rewriting the object?
- How do I evolve the schema across many objects?
- How do I version the logical table?
- How do I atomically commit a set of objects as "the new table state"?
These are the questions an open table format like systems/apache-iceberg layers on top — typically by writing a metadata / snapshot layer that points at Parquet data files.
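That metadata / snapshot layering can be sketched in miniature. Everything here is hypothetical and illustrative — JSON snapshots and an atomically swapped `current` pointer stand in for Iceberg's actual manifest and catalog machinery:

```python
import json
import os
import tempfile

def commit(table_dir: str, data_files: list[str], version: int) -> None:
    """Write a new snapshot listing immutable Parquet data files, then
    atomically swap the 'current' pointer to it. Readers never observe
    a partial commit."""
    snap = os.path.join(table_dir, f"snap-{version}.json")
    with open(snap, "w") as f:
        json.dump({"version": version, "data_files": data_files}, f)
    tmp = os.path.join(table_dir, "current.tmp")
    with open(tmp, "w") as f:
        f.write(snap)
    os.replace(tmp, os.path.join(table_dir, "current"))  # atomic on POSIX

def current_files(table_dir: str) -> list[str]:
    """Resolve the current snapshot, then the Parquet files it points at."""
    with open(os.path.join(table_dir, "current")) as f:
        snap = f.read()
    with open(snap) as f:
        return json.load(f)["data_files"]

d = tempfile.mkdtemp()
commit(d, ["part-000.parquet"], 1)
commit(d, ["part-000.parquet", "part-001.parquet"], 2)  # "new table state"
files = current_files(d)
```

Note how every question above reduces to operations on the metadata layer: mutation and schema evolution rewrite snapshots, versioning keeps old ones, and the atomic commit is a single pointer swap — the Parquet objects themselves stay immutable.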
Seen in¶
- sources/2025-03-14-allthingsdistributed-s3-simplicity-is-table-stakes — Parquet framed as the on-object data layer under Iceberg; cited at exabyte-stored / hundreds-of-petabytes-served-per-day scale on S3.
- sources/2025-01-29-datadog-husky-efficient-compaction-at-datadog-scale — Datadog's Husky uses a Parquet-like custom columnar format ("similar to Parquet with one row group and many pages, but specially designed for observability data"). Notable deltas vs. stock Parquet: inline column headers for streaming discovery during compaction (vs. Parquet's footer-at-end), adaptive row-group size tuned against the heaviest input column (log messages up to 75 KiB/event), and per-column fragment metadata that goes beyond min/max to a trimmed-FSA regex (patterns/trimmed-automaton-predicate-filter).
- sources/2026-04-07-allthingsdistributed-s3-files-and-the-changing-face-of-s3 — Warfield cites Parquet's scale on S3 as the structural-data context for the 2024-2026 multi-primitive expansion: S3 "stores exabytes of parquet data and averages over 25 million requests per second to that format alone." The magnitude of that installed base is the reason Iceberg-over-Parquet became a de-facto table layer and why S3 Tables absorbed the managed-Iceberg role.
- sources/2024-07-29-aws-amazons-exabyte-scale-migration-from-apache-spark-to-ray-on-ec2 — Amazon Retail BDT's Ray compactor reads Parquet from S3 and materialises to systems/apache-arrow in-memory. Q1 2024: 1.5 EiB of Parquet input decoded into ~4 EiB of in-memory Arrow during compaction. Joint optimisation with systems/daft on Parquet I/O yielded +24% production cost-efficiency; median single-column Parquet read was −55% vs PyArrow and −91% vs S3Fs. One of the largest public Parquet-at-scale numbers outside S3's own fleet-wide exabyte / 25M-rps statistic.