REDPANDA 2026-06-30

How Redpanda Cloud Topics rethinks Kafka compaction¶

Summary¶

This post explains how Redpanda Cloud Topics fundamentally rethinks Kafka log compaction by decoupling it from broker-local replicas. In traditional disk-based Kafka, every broker independently compacts its own replica — burning redundant CPU, competing with produce/consume workloads, and introducing a coordination problem for tombstone removal. Cloud Topics eliminates all three issues: compaction operates once against the canonical copy in object storage, any broker (or any shard on any node) can run it, and a pull-based scheduling model prevents compaction starvation on overloaded shards.

Key takeaways¶

Kafka compaction is a two-pass algorithm — scan the log to build a key→latest-offset map, then rewrite the log keeping only the latest record per key. The real challenge is making this scale: handling partitions with unbounded key counts and not starving the broker of CPU.
SHA-256 hashing fixes the memory problem — Redpanda's key-offset map uses SHA-256 hashes (32 bytes) + 8-byte offsets = fixed 40 bytes per entry. With a default 128 MiB allocation, this fits ~3.3 million unique keys per pass.
Backward scanning of dirty ranges guarantees that the first indexed offset for any key is the latest one, enabling early termination when the map is full and convergence to full deduplication over finite compactions.
Redundant compaction is the fundamental problem with disk-based Kafka — with replication.factor=3, the same logical data is compacted 3 times independently on 3 brokers, wasting CPU and creating a tombstone coordination race.
Cloud Topics decouples compaction from the broker — data lives once in object storage; compaction runs once against the canonical copy. This eliminates redundant work, frees broker CPU for produce/consume, and removes the tombstone coordination problem entirely.
Any shard on any node can be a compactor — because compaction operates on object-store data + metastore metadata rather than local replica state, work can be distributed freely across the cluster, independent of partition leadership.
Pull-based scheduling prevents starvation — a scheduler (on shard 0 of each compaction node) maintains a priority queue based on dirty ratio (min.cleanable.dirty.ratio) and time since eligibility (max.compaction.lag.ms). Workers on every shard poll for work, decoupling compaction scheduling from partition placement.
Multi-part uploads cap memory usage — compacted L1 objects are uploaded via multi-part upload, scaling memory with part size rather than total object size. No spill to disk required.
Optimistic concurrency via compaction_epoch — the metastore tracks a per-partition integer compaction_epoch incremented on each successful update. If two nodes compact the same log concurrently, the stale writer's update is rejected — preventing reintroduction of stale data.

Operational numbers¶

Parameter	Value
Key-offset map entry size	40 bytes (32-byte SHA-256 + 8-byte offset)
Default key-offset map allocation	128 MiB
Keys per compaction pass	~3.3 million

Architecture: Cloud Topics compaction vs traditional Kafka¶

Dimension	Traditional Kafka	Cloud Topics
Compaction copies	RF independent compactions	1 (canonical object store copy)
Where it runs	Each broker, competing with produce/consume	Any shard on any node
Scheduling	Coupled to partition leader placement	Pull-based priority queue; decoupled from placement
Tombstone coordination	Requires cross-broker protocol	Eliminated — single copy, no divergence possible
Memory scaling	Per-object size	Per-part size (multi-part upload)
Concurrency control	N/A (each broker owns its replica)	Optimistic — `compaction_epoch` in metastore

Caveats¶

The post does not provide throughput benchmarks for Cloud Topics compaction (bytes/s, latency to convergence).
compaction_epoch conflict resolution is optimistic — under heavy concurrent compaction, wasted work is possible (though not a correctness issue).
The scheduling heuristic (dirty ratio + lag) may still allow starvation in extreme edge cases (many large partitions, limited cluster CPU).

Source¶

concepts/log-compaction — general concept
concepts/compaction-replication-race — the bug Cloud Topics eliminates
concepts/coordinated-compaction — Redpanda's fix for disk-based mode
systems/redpanda-cloud-topics — the overall Cloud Topics architecture
systems/redpanda-cloud-topics-metastore — stores compaction_epoch
sources/2026-06-25-redpanda-kafkas-log-compaction-corrupts-data — companion article on the Kafka bug