
PATTERN

Client-side compression over broker compression

Problem

Kafka-API brokers support two compression topologies:

  1. Client-side — producer compresses batches; broker treats the compressed bytes as opaque; consumer decompresses.
  2. Broker-side — client sends uncompressed; broker compresses before writing; broker decompresses before sending to consumer.

Both produce the same on-wire compression ratio at steady state, but the CPU cost distribution differs dramatically. Operators choosing the wrong topology push CPU onto the wrong tier.

Solution

Always compress on the client, not the broker. Set the topic's compression.type to producer — meaning the broker passes through whatever codec the producer chose, unchanged.
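A minimal pair of config fragments illustrating the split (property names are standard Kafka configuration; the codec and sizing values are illustrative, not recommendations):

```properties
# Producer (client-side) configuration — the codec is chosen here
compression.type=zstd     # or lz4 / snappy / gzip
batch.size=131072         # illustrative; larger batches compress better
linger.ms=10

# Topic (broker-side) configuration — pass through whatever the producer sent
compression.type=producer
```

With this pairing, the broker never re-encodes: the producer's compressed batch is the unit that is replicated, stored, and served.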

Redpanda's verbatim recommendation (Source: sources/2025-04-23-redpanda-need-for-speed-9-tips-to-supercharge-redpanda):

"Compress on the client, not the broker (topic configuration for compression should be set to producer)."

Why this is the right topology

Three arguments:

1. Broker is the scarce resource

The broker is shared across all producers and consumers of a topic. Every watt of broker CPU spent on compression cannot be spent on replication, fetch handling, compaction, or other broker-internal work. Clients, by contrast, are horizontally elastic — you can always add another client host; you can't always add another broker.

Pushing compression CPU to the client reduces the broker's load profile to its essential work: accept bytes, replicate bytes, serve bytes. All opaque-byte handling; very cheap per byte.

2. Compression compounds with batching on the client

Effective batch size is determined by producer-side dynamics (linger.ms, batch.size, the partitioner, partition fan-out). The batch boundary is the correct compression boundary: records in a batch share dictionary redundancy, so compression ratios improve with batch size. If the broker recompresses, it operates on whatever batch structure the producer happened to form and gains no additional compression.

3. Broker stays in opaque-byte regime

From concepts/compression-compaction-cpu-cost:

"The compaction process runs in the broker and is actually the only use case where the broker reads message-level details from a topic. Usually, Redpanda treats the data as opaque bytes that need to be sent without reading them in detail."

Client-side compression preserves this invariant. The broker accepts a compressed batch as opaque bytes, replicates it as opaque bytes, writes it to disk as opaque bytes, and serves it to consumers as opaque bytes. The broker-CPU floor is minimal.

Broker-side compression breaks this — now the broker must decompress + recompress, just like the compaction pathway. The CPU tax applies to every topic on the broker, not just compacted topics.

Concrete topic configuration

Kafka / Redpanda topic configs:

compression.type              Behaviour
----------------------------  ------------------------------------------------------------------
producer (recommended)        Broker passes producer-compressed batches through unchanged
uncompressed                  Broker stores and serves uncompressed; discards producer compression
zstd / lz4 / snappy / gzip    Broker re-encodes batches with the specified codec

Set compression.type=producer on all new topics unless there's a specific contract-enforcement reason to force a broker-side codec.
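With the stock tooling, this looks roughly like the following (topic name `events` and the bootstrap address are placeholders; verify flags against your installed versions):

```shell
# Apache Kafka: set pass-through compression on an existing topic
kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name events \
  --add-config compression.type=producer

# Redpanda equivalent via rpk
rpk topic alter-config events --set compression.type=producer
```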

Batches, not messages

The post adds a companion rule:

"Clients compress batches, not messages, therefore increasing batching will also make compression more effective."

Producer clients compress at the batch boundary; compressing individual records would lose the cross-record redundancy that makes compression work. A well-batched workload (patterns/batch-over-network-to-broker) is therefore a prerequisite for compression to deliver its full ratio.

The chain:

Small batches → poor compression ratio → bandwidth savings small → maybe not worth CPU
Big batches   → good compression ratio → bandwidth savings big   → strongly worth CPU

Which means: compress + batch well together, or don't compress at all.
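The ratio gap is easy to demonstrate with nothing but the standard library. A sketch using zlib as a stand-in codec (the record shape and counts are made up; real producer batches behave the same way directionally):

```python
import json
import random
import zlib

random.seed(0)

# Simulated records: small JSON events with shared field names,
# typical of Kafka payloads.
records = [
    json.dumps({
        "user_id": random.randint(1, 1000),
        "event": "page_view",
        "url": f"/products/{random.randint(1, 50)}",
    }).encode()
    for _ in range(200)
]

raw = sum(len(r) for r in records)

# Per-record compression: each tiny input pays the codec's fixed
# overhead and sees none of the cross-record redundancy.
per_record = sum(len(zlib.compress(r)) for r in records)

# Batch compression: one stream over the whole batch, so repeated
# field names and values compress against each other.
batched = len(zlib.compress(b"".join(records)))

print(f"raw={raw}  per-record={per_record}  batched={batched}")
```

On payloads like these, the batched size comes out far below both the raw size and the per-record total, while per-record compression can even exceed the raw size — which is exactly why clients compress batches, not messages.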

Exception: compacted topics

Compacted topics (cleanup.policy=compact) require the broker to read record keys, which forces it to decompress compressed batches. For compacted topics, compression therefore carries broker-side CPU cost no matter what — canonicalised as concepts/compression-compaction-cpu-cost. The client-side-only guarantee of this pattern breaks down at the compaction boundary.

For compacted topics that must be compressed, prefer LZ4 (low decompress/recompress CPU cost) over ZSTD.
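A sketch of creating such a topic with rpk (the topic name is a placeholder; check the flag spelling against your rpk version):

```shell
rpk topic create clickstream.compacted \
  --topic-config cleanup.policy=compact \
  --topic-config compression.type=lz4
```

Choosing LZ4 here bounds the decompress/recompress tax the compaction pathway imposes on the broker.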
