CONCEPT
Small file problem on object storage¶
The small file problem is the pathology in which a streaming-to-lakehouse pipeline produces many small object-store files (Parquet / ORC / Avro) instead of fewer, well-sized ones. The canonical symptom is that downstream query performance, metadata consistency, and operational cost all degrade roughly linearly with file count at a fixed data volume.
Why it hurts¶
For a target table of fixed total byte size, splitting the data across N small files vs fewer large files multiplies several O(N) costs:
- Listing cost — the Iceberg manifest scan and the underlying object-store listing both pay per-file overhead. Spark/Trino/Snowflake query planners read every manifest entry to compute the scan set; tables with millions of small files can see planning time dominate query latency.
- Parquet open + footer amortisation — each Parquet file carries a footer with column offsets and statistics. For small files, per-file open cost is dominated by the footer round-trip, and that fixed footer cost is amortised over very little column data, so effective scan throughput drops.
- Iceberg metadata bloat — every file is a manifest entry, every commit produces a snapshot. Snapshot-based query engines retain manifest history for time travel; small-file-heavy tables produce manifest chains that are expensive to traverse.
- Compaction burn — the downstream fix is a recurring compaction job that merges small files into larger ones; this costs compute, IO, and operator attention (job scheduling, failure recovery, snapshot coordination).
- Per-request object-store cost — S3, GCS, ADLS all charge per-PUT / per-GET. Small files amplify the number of PUTs during write and GETs during scan, both billed.
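These O(N) overheads can be made concrete with a toy cost model. The constants below (footer round-trip latency, per-manifest-entry planning cost, per-PUT price) are illustrative assumptions, not measured values:

```python
# Hedged sketch: a toy model of the per-file overheads listed above.
# All constants are assumptions chosen for illustration only.

def scan_overhead(total_bytes, file_size_bytes,
                  footer_rtt_ms=50.0,      # assumed per-file footer fetch
                  manifest_entry_ms=0.01,  # assumed per-entry planning cost
                  put_price=0.000005):     # assumed $ per PUT request
    """Return (n_files, planning_ms, footer_ms, write_cost_usd) for a
    table of total_bytes split into files of file_size_bytes each."""
    n_files = max(1, total_bytes // file_size_bytes)
    planning_ms = n_files * manifest_entry_ms
    footer_ms = n_files * footer_rtt_ms
    write_cost = n_files * put_price
    return n_files, planning_ms, footer_ms, write_cost

tib = 1 << 40
small = scan_overhead(tib, 1 << 20)    # ~1M files: per-file overheads dominate
large = scan_overhead(tib, 512 << 20)  # ~2K files: overheads negligible
```

Every term scales with `n_files`, which is why the symptom tracks file count rather than byte volume.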
Root cause in streaming sinks¶
In a streaming sink, the flush trigger shape determines small-file risk. Timer-driven flushing — the Kafka Connect-era default — writes one object per flush interval regardless of data volume. On quiet or bursty streams, this produces many tiny files.
concepts/data-driven-flushing (Redpanda Connect Iceberg output, 2026-03-05 launch) is canonicalised on the wiki as the mitigation pattern: flush only when data is present, letting batch size track workload.
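A minimal sketch of the two trigger shapes, using a simulated record stream. The buffer and tick loop here are hypothetical stand-ins for a connector runtime, not any real sink's API:

```python
# Hedged sketch: timer-driven vs data-driven flushing over a record
# stream. `arrivals` is a list of record arrival times (same unit as
# `interval`); each flush that carries records becomes one object.

def timer_driven_flushes(arrivals, interval, horizon):
    """One flush per interval regardless of volume: quiet or bursty
    streams yield many tiny (or empty) objects."""
    flushes, buf = [], 0
    events = iter(sorted(arrivals))
    pending = next(events, None)
    for tick in range(interval, horizon + 1, interval):
        while pending is not None and pending <= tick:
            buf += 1
            pending = next(events, None)
        flushes.append(buf)  # flushes even when buf == 0
        buf = 0
    return flushes

def data_driven_flushes(arrivals, interval, horizon):
    """Flush only when data is present: batch size tracks workload."""
    return [f for f in timer_driven_flushes(arrivals, interval, horizon)
            if f > 0]
```

For a quiet stream of three records in a hundred ticks with `interval=10`, the timer-driven sink emits ten objects (most empty or tiny) while the data-driven sink emits two.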
The Redpanda Iceberg output canonicalisation¶
The launch post for the Redpanda Connect Iceberg output names the small-file problem explicitly as its foil (Source: sources/2026-03-05-redpanda-introducing-iceberg-output-for-redpanda-connect):
"Redpanda Connect uses data-driven flushing. It only executes a flush operation when there is actual data to move, preventing the 'small file problem' on object storage and ensuring you aren't wasting compute cycles on empty operations."
The launch post sets the term in scare-quotes, treating it as a recognised, industry-canonical pathology in the streaming-to-lakehouse community.
Related axes¶
- Iceberg snapshot cadence — every commit creates a snapshot; commit frequency and file-flush frequency are coupled.
- Partition granularity — finer partition spec (e.g. hourly vs daily) amplifies file count proportionally.
- Compaction policy — some lakehouse runtimes (e.g. AWS S3 Tables, Databricks Auto-optimize) ship managed compaction; self-managed Iceberg deployments need recurring compaction jobs.
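At its core, the recurring compaction job is a bin-packing pass: group small files into target-sized rewrite groups, then rewrite each group as one large file. A greedy sketch with an assumed 512 MiB target (managed services and Iceberg's rewrite procedures automate a more sophisticated version of this planning step):

```python
# Hedged sketch: greedy compaction planning. Groups input file sizes
# into rewrite groups whose combined size approaches target_bytes.

def plan_compaction(file_sizes, target_bytes=512 << 20):
    """Return a list of groups (lists of file sizes); each group is
    intended to be rewritten as a single output file."""
    groups, current, current_size = [], [], 0
    for size in sorted(file_sizes):
        if current and current_size + size > target_bytes:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups

# 1000 files of 1 MiB each plan into two rewrite groups
groups = plan_compaction([1 << 20] * 1000)
```

The plan itself is cheap; the recurring cost is the rewrite IO plus the snapshot coordination needed to commit it without clobbering concurrent writers.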
Seen in¶
- Redpanda — Introducing Iceberg output for Redpanda Connect (2026-03-05) — canonicalised as the foil for data-driven flushing. Launch post names the problem in scare-quotes as a recognised streaming-sink pathology.
- Redpanda — Under the hood: Cloud Topics architecture (2026-03-30) — surfaces the same problem one altitude down (streaming broker writing primary record data to S3 instead of lakehouse sink) and names the canonical write-path mitigation: multi-partition coalescing into a single L0 file per batch window, "specifically to minimize the cost of object storage; by aggregating smaller writes into larger batches, we significantly reduce the number of PUT requests sent to S3." Read-side mitigation is the background Reconciler rewriting L0 into larger, per-partition, offset-sorted L1 files. See concepts/l0-l1-file-compaction-for-object-store-streaming.
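The multi-partition coalescing above can be sketched as follows. The length-prefixed layout and partition index are illustrative assumptions, not Redpanda's actual L0 format:

```python
# Hedged sketch: coalesce per-partition record batches into one L0
# blob so a single PUT covers all partitions in the batch window.

import io
import struct

def coalesce_to_l0(batches):
    """batches: {partition_id: [record_bytes, ...]}.
    Returns (l0_blob, index) where index maps partition -> (offset, length),
    so a later Reconciler pass can split the blob into per-partition,
    offset-sorted L1 files."""
    buf = io.BytesIO()
    index = {}
    for pid, records in sorted(batches.items()):
        start = buf.tell()
        for rec in records:
            buf.write(struct.pack(">I", len(rec)))  # length-prefixed record
            buf.write(rec)
        index[pid] = (start, buf.tell() - start)
    return buf.getvalue(), index  # one PUT instead of len(batches) PUTs
```

The write path pays one PUT per batch window instead of one per partition; the read path pays for that later, which is exactly the gap the L0-to-L1 Reconciler closes.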
Related¶
- concepts/data-driven-flushing — the mitigation
- concepts/data-lakehouse, concepts/open-table-format
- concepts/effective-batch-size — analogous batch-size axis upstream
- concepts/l0-l1-file-compaction-for-object-store-streaming — the streaming-broker altitude of the same problem.
- concepts/placeholder-batch-metadata-in-raft — enables the multi-partition coalescing write-side mitigation.
- systems/apache-iceberg, systems/apache-parquet
- systems/aws-s3, systems/google-cloud-storage
- systems/redpanda-cloud-topics — canonical instance of the streaming-broker-altitude mitigation.
- patterns/streaming-broker-as-lakehouse-bronze-sink
- patterns/object-store-batched-write-with-raft-metadata — the write-side mitigation pattern.
- patterns/background-reconciler-for-read-path-optimization — the read-side mitigation pattern.