CONCEPT Cited by 1 source
Multi-part upload for compaction
Definition
When Cloud Topics compaction rewrites L1 objects to object storage, it uses multi-part uploads rather than buffering the entire object in memory before uploading. This bounds the compactor's memory usage to the size of a single upload part, regardless of total object size.
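The bounded-memory behaviour can be illustrated with a minimal sketch. This is not Redpanda's implementation: `InMemoryMultipartStore` is a hypothetical stand-in for an object store client, and `upload_in_parts`, its parameters, and the 5 MiB default are assumptions chosen for illustration. The uploader flushes a part whenever its buffer reaches `part_size`, so it holds less than one part plus one input chunk at any time.

```python
from typing import Dict, Iterable


class InMemoryMultipartStore:
    """Hypothetical stand-in for an object store's multi-part API."""

    def __init__(self) -> None:
        self.parts: Dict[int, bytes] = {}
        self.objects: Dict[str, bytes] = {}

    def upload_part(self, part_number: int, data: bytes) -> None:
        self.parts[part_number] = data

    def complete(self, key: str) -> None:
        # Assemble the final object from its parts in part-number order.
        self.objects[key] = b"".join(self.parts[n] for n in sorted(self.parts))
        self.parts.clear()


def upload_in_parts(
    store: InMemoryMultipartStore,
    key: str,
    chunks: Iterable[bytes],
    part_size: int = 5 * 1024 * 1024,  # assumed default, matching S3's minimum
) -> int:
    """Stream chunks to the store and return the peak buffered byte count."""
    buf = bytearray()
    part_number = 1
    peak = 0
    for chunk in chunks:
        buf.extend(chunk)
        peak = max(peak, len(buf))
        # Flush full parts as soon as enough bytes have accumulated.
        while len(buf) >= part_size:
            store.upload_part(part_number, bytes(buf[:part_size]))
            del buf[:part_size]
            part_number += 1
    if buf:
        # The final part may be smaller than part_size.
        store.upload_part(part_number, bytes(buf))
    store.complete(key)
    return peak
```

In this sketch the peak buffer size depends only on `part_size` and the largest input chunk, not on the total number of bytes written.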
Why it matters
Without multi-part uploads, a compactor writing large L1 objects would either:
- Need memory proportional to object size (potentially gigabytes), or
- Need to spill intermediate results to local disk (re-introducing disk dependency)
Multi-part uploads are natively supported by all object storage backends Redpanda targets (S3, GCS, ADLS), so the optimization applies on every supported backend.
Trade-offs
- Adds complexity: every multi-part upload must be explicitly completed or aborted, and parts from incomplete uploads keep accruing storage charges until they are aborted or cleaned up
- Part ordering must be maintained for correctness
- Minimum part size constraints (e.g., 5 MiB on S3 for every part except the last) set a floor on memory usage
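The three trade-offs above can be sketched together. The following is an illustrative model only, under the assumption of a hypothetical `FakeUploadSession` class and `upload_or_abort` helper (neither is a Redpanda or cloud SDK API): the session rejects undersized non-final parts, assembles parts by part number, and the helper aborts the upload on any failure so that no orphaned parts remain.

```python
from typing import Dict, Iterable

MIN_PART_SIZE = 5 * 1024 * 1024  # S3's floor for every part except the last


class FakeUploadSession:
    """Hypothetical in-memory model of a single multi-part upload."""

    def __init__(self, min_part_size: int = MIN_PART_SIZE) -> None:
        self.min_part_size = min_part_size
        self.parts: Dict[int, bytes] = {}
        self.state = "open"

    def upload_part(self, part_number: int, data: bytes) -> None:
        self.parts[part_number] = data

    def complete(self) -> bytes:
        numbers = sorted(self.parts)
        # Every part except the last must meet the minimum size.
        for n in numbers[:-1]:
            if len(self.parts[n]) < self.min_part_size:
                raise ValueError(f"part {n} is below the minimum part size")
        self.state = "completed"
        # Part numbers, not arrival order, define the object's byte order.
        return b"".join(self.parts[n] for n in numbers)

    def abort(self) -> None:
        # Discard staged parts so they stop consuming storage.
        self.parts.clear()
        self.state = "aborted"


def upload_or_abort(session: FakeUploadSession, parts: Iterable[bytes]) -> bytes:
    """Upload numbered parts, aborting the session if anything fails."""
    try:
        for number, data in enumerate(parts, start=1):
            session.upload_part(number, data)
        return session.complete()
    except Exception:
        session.abort()
        raise
```

A real client would also need a way to clean up uploads orphaned by a process crash, which a `try`/`except` block cannot cover.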