CONCEPT Cited by 1 source

Multi-part upload for compaction

Definition

When Cloud Topics compaction rewrites L1 objects to object storage, it uses multi-part uploads rather than buffering the entire object in memory before uploading. This bounds the compactor's memory usage to the size of a single upload part, regardless of total object size.
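As a rough illustration of the idea, the sketch below streams compacted output into fixed-size parts. It is not Redpanda's implementation: `InMemoryUpload` and `stream_multipart` are hypothetical names, and the in-memory class only stands in for a real object-store upload session. In this sketch the buffer can briefly hold up to one part plus one incoming chunk, so the bound depends on part size and chunk size, never on total object size.

```python
from typing import Iterable, List, Tuple

MIN_PART_SIZE = 5 * 1024 * 1024  # S3's minimum for every part except the last


class InMemoryUpload:
    """Hypothetical stand-in for an object store's multi-part upload session."""

    def __init__(self) -> None:
        self.parts: List[Tuple[int, bytes]] = []
        self.completed = False
        self.aborted = False

    def upload_part(self, part_number: int, data: bytes) -> None:
        self.parts.append((part_number, data))

    def complete(self) -> bytes:
        # The final object is assembled by part number, not arrival order.
        self.completed = True
        return b"".join(data for _, data in sorted(self.parts))

    def abort(self) -> None:
        # Discard uploaded parts so they stop occupying storage.
        self.parts.clear()
        self.aborted = True


def stream_multipart(chunks: Iterable[bytes], upload: InMemoryUpload,
                     part_size: int = MIN_PART_SIZE) -> int:
    """Upload chunks as numbered parts; return the peak buffered byte count."""
    buf = bytearray()
    part_number = 1
    peak = 0
    try:
        for chunk in chunks:
            buf += chunk
            peak = max(peak, len(buf))
            # Flush full parts as soon as enough bytes are buffered.
            while len(buf) >= part_size:
                upload.upload_part(part_number, bytes(buf[:part_size]))
                del buf[:part_size]
                part_number += 1
        # The last part may be smaller than part_size.
        if buf or part_number == 1:
            upload.upload_part(part_number, bytes(buf))
        upload.complete()
    except Exception:
        upload.abort()  # never leave an incomplete upload behind
        raise
    return peak
```

A real compactor would replace `InMemoryUpload` with calls to the backend's create, upload-part, complete, and abort operations; the buffering logic stays the same.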

Why it matters

Without multi-part uploads, a compactor writing large L1 objects would either:

  • Need memory proportional to object size (potentially gigabytes), or
  • Need to spill intermediate results to local disk (re-introducing disk dependency)

Multi-part uploads are natively supported by all object storage backends Redpanda targets (S3, GCS, ADLS), making this a universal optimization.

Trade-offs

  • Adds complexity: every multi-part upload must be explicitly completed or aborted, because the parts of an incomplete upload keep accruing storage charges until it is aborted
  • Part ordering must be maintained for correctness
  • Minimum part size constraints (e.g., 5 MiB on S3) set a floor on memory usage
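The part-size floor interacts with a cap on the number of parts per upload. The sketch below works through that arithmetic using S3's documented limits (5 MiB minimum part, 5 GiB maximum part, 10,000 parts per upload); other backends have different limits. `choose_part_size` is a hypothetical helper, not part of any SDK or of Redpanda.

```python
MIB = 1024 * 1024
S3_MIN_PART = 5 * MIB          # minimum size of every part except the last
S3_MAX_PART = 5 * 1024 * MIB   # 5 GiB maximum size of a single part
S3_MAX_PARTS = 10_000          # maximum number of parts in one upload


def choose_part_size(object_size: int, preferred: int = S3_MIN_PART) -> int:
    """Smallest part size >= preferred that keeps the part count within limits."""
    # Ceiling division: bytes per part needed to fit in S3_MAX_PARTS parts.
    needed = -(-object_size // S3_MAX_PARTS)
    size = max(preferred, S3_MIN_PART, needed)
    if size > S3_MAX_PART:
        raise ValueError("object too large for a single multi-part upload")
    return size
```

For a 1 GiB object the 5 MiB floor applies, so per-upload buffering is about 5 MiB. For a 100 GiB object the part-count cap pushes the part size, and therefore the memory floor, to roughly 10 MiB.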

Seen in
