PATTERN Cited by 1 source
Client-side compression over broker compression¶
Problem¶
Kafka-API brokers support two compression topologies:
- Client-side — producer compresses batches; broker treats the compressed bytes as opaque; consumer decompresses.
- Broker-side — client sends uncompressed; broker re-encodes with the topic's codec before writing; the consumer decompresses what it fetches.
Both leave the same compressed bytes on disk at steady state, but the CPU cost distribution differs dramatically (and the broker-side topology additionally spends uncompressed bandwidth on the produce path). Operators choosing the wrong topology push CPU onto the wrong tier.
Solution¶
Always compress on the client, not the broker. Set the topic's compression.type to producer, meaning the broker passes through whatever codec the producer chose, unchanged.
Redpanda's verbatim recommendation (Source: sources/2025-04-23-redpanda-need-for-speed-9-tips-to-supercharge-redpanda):
"Compress on the client, not the broker (topic configuration for compression should be set to producer)."
Why this is the right topology¶
Three arguments:
1. Broker is the scarce resource¶
The broker is shared across all producers and consumers of a topic. Every watt of broker CPU spent on compression cannot be spent on replication, fetch handling, compaction, or other broker-internal work. Clients, by contrast, are horizontally elastic — you can always add another client host; you can't always add another broker.
Pushing compression CPU to the client reduces the broker's load profile to its essential work: accept bytes, replicate bytes, serve bytes. All opaque-byte handling; very cheap per byte.
2. Compression compounds with batching on the client¶
Effective batch size is determined by producer-side dynamics (linger, batch-size buffer, partitioner, fan-out). The batch boundary is the correct compression boundary — multiple records in a batch share dictionary redundancy; compression ratios improve with batch size. If the broker recompresses, it's operating on whatever batch structure the producer happened to form, with no added compression win.
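The batch-boundary argument is easy to observe with a stand-in codec. A minimal sketch using Python's stdlib zlib (the record shape and field names are invented for illustration; real producer codecs like LZ4/ZSTD behave analogously):

```python
import json
import zlib

def record(i):
    # Synthetic event record with the field-name redundancy real events share.
    return json.dumps({"user_id": i, "event": "page_view",
                       "region": "us-east-1", "ok": True}).encode()

def batch_ratio(n):
    """Compression ratio (raw bytes / compressed bytes) for a batch of n records."""
    raw = b"\n".join(record(i) for i in range(n))
    return len(raw) / len(zlib.compress(raw))

for n in (1, 10, 100):
    print(f"batch of {n:>3} records: ratio {batch_ratio(n):.2f}x")
```

The ratio climbs with batch size because later records reuse the dictionary built from earlier ones; a broker recompressing the same batch boundaries gains nothing over the producer doing it.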
3. Broker stays in opaque-byte regime¶
From concepts/compression-compaction-cpu-cost:
"The compaction process runs in the broker and is actually the only use case where the broker reads message-level details from a topic. Usually, Redpanda treats the data as opaque bytes that need to be sent without reading them in detail."
Client-side compression preserves this invariant. The broker accepts a compressed batch as opaque bytes, replicates it as opaque bytes, writes it to disk as opaque bytes, and serves it to consumers as opaque bytes. The broker-CPU floor is minimal.
Broker-side compression breaks this — now the broker must decompress + recompress, just like the compaction pathway. The CPU tax applies to every topic on the broker, not just compacted topics.
Concrete topic configuration¶
Kafka / Redpanda topic configs:
| compression.type | Behaviour |
|---|---|
| producer (recommended) | Broker passes producer-compressed batches through unchanged |
| uncompressed | Broker stores and serves uncompressed; discards producer compression |
| zstd / lz4 / snappy / gzip | Broker re-encodes with the specified codec |
Set compression.type=producer on all new topics unless there's a specific contract-enforcement reason to force a broker-side codec.
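A sketch of applying this with the Apache Kafka CLI tools (the topic name `events` and broker address are placeholders; Redpanda's rpk has equivalent topic-create and config-alter commands):

```shell
# Create a topic whose broker passes producer-compressed batches through unchanged.
kafka-topics.sh --bootstrap-server localhost:9092 \
  --create --topic events \
  --config compression.type=producer

# Or retrofit an existing topic:
kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name events \
  --add-config compression.type=producer
```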
Batches, not messages¶
The post adds a companion rule:
"Clients compress batches, not messages, therefore increasing batching will also make compression more effective."
Producer clients compress at the batch boundary; compressing individual records would lose the cross-record redundancy that makes compression work. A well-batched workload (patterns/batch-over-network-to-broker) is therefore a prerequisite for compression to deliver its full ratio.
The chain:
Small batches → poor compression ratio → bandwidth savings small → maybe not worth CPU
Big batches → good compression ratio → bandwidth savings big → strongly worth CPU
Which means: compress + batch well together, or don't compress at all.
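The companion rule can be made concrete with the same stdlib-zlib stand-in: compressing each record on its own forfeits the cross-record redundancy that batch compression exploits (record contents invented for illustration):

```python
import json
import zlib

records = [json.dumps({"user_id": i, "event": "page_view",
                       "region": "us-east-1"}).encode() for i in range(200)]

# Codec applied per record: each message pays header overhead, shares nothing.
per_message = sum(len(zlib.compress(r)) for r in records)

# Codec applied per batch: one stream, cross-record redundancy exploited.
per_batch = len(zlib.compress(b"\n".join(records)))

print(f"per-message total: {per_message} bytes; per-batch: {per_batch} bytes")
```

The per-batch total comes out a small fraction of the per-message total, which is why producer clients compress at the batch boundary.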
Exception: compacted topics¶
Compacted topics (with cleanup.policy=compact) force the broker to read record keys, which forces decompression of compressed records. For compacted topics, compression is therefore partly broker-side CPU no matter what — canonicalised as concepts/compression-compaction-cpu-cost. The client-side-only guarantee of this pattern breaks down at the compaction boundary.
For compacted topics that must be compressed, prefer LZ4 (low decompress/recompress CPU cost) over ZSTD.
Seen in¶
- sources/2025-04-23-redpanda-need-for-speed-9-tips-to-supercharge-redpanda — canonical wiki source. "Compress on the client, not the broker" rule; the compression.type=producer topic-config instruction; the batches-not-messages companion rule.
Related¶
- systems/kafka, systems/redpanda — Kafka-API brokers where this pattern applies.
- concepts/compression-codec-tradeoff — which codec the client should pick.
- concepts/compression-compaction-cpu-cost — the exception case where client-side-only guarantee breaks.
- concepts/effective-batch-size — prerequisite for compression to achieve its full ratio.
- patterns/batch-over-network-to-broker — the enabling pattern.