Producer backpressure batch growth¶
Definition¶
Producer backpressure batch growth is the counterintuitive
behaviour of Kafka-API producers under broker saturation: when the
broker is heavily loaded, producers grow their batches past the
configured batch.size ceiling rather than shrinking them.
Canonicalised by Redpanda's 2024-11-19 batch-tuning explainer:
"[Kafka] clients have an internal mechanism for tracking the in-flight batches sent to the cluster and awaiting acknowledgment. If Redpanda is heavily loaded, a client with all of the 'max in flight' slots in use will experience a form of backpressure, such that the client will continue to add records into a queued batch even beyond the maximum batch size. This will result in increased batch sizes while the cluster is heavily loaded." (Source: sources/2024-11-19-redpanda-batch-tuning-in-redpanda-for-optimized-performance-part-1)
Mechanism¶
The Kafka client's transport pipeline has a bounded number of
in-flight request slots (default 5 per broker connection via
max.in.flight.requests.per.connection). When all slots are
occupied:
- The client cannot dispatch a new batch — no free slot.
- Records continue to arrive from the application.
- The client has to put them somewhere — the only option is to keep enqueueing them into the currently-open batch (or queue them into a closed-but-waiting batch via an overflow path).
- The open batch grows past batch.size because the size threshold's dispatch action ("close and send") is blocked by the missing slot.
This behaviour is protective: the alternative would be to either
drop records or block the producer thread. Enqueueing past
batch.size preserves the ordering and durability contract at
the cost of inflating batches.
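The mechanism can be sketched as a toy simulation. This is illustrative only; the names, slot counts, and sizes are made up and do not reflect the real client's internals:

```python
# Toy model of a producer accumulator under in-flight backpressure.
# All names and numbers are illustrative, not the real Kafka client.
BATCH_SIZE = 16_384        # configured batch.size ceiling (bytes)
MAX_IN_FLIGHT = 5          # max.in.flight.requests.per.connection

def produce(record_sizes, broker_busy):
    """Append records; dispatch a batch only when an in-flight slot is free."""
    in_flight = MAX_IN_FLIGHT if broker_busy else 0  # occupied slots
    batch_bytes = 0
    dispatched = []
    for size in record_sizes:
        batch_bytes += size
        if batch_bytes >= BATCH_SIZE:
            if in_flight < MAX_IN_FLIGHT:
                # Normal regime: a slot is free, so dispatch at ~batch.size.
                dispatched.append(batch_bytes)
                batch_bytes = 0
            # Backpressure regime: no free slot, so the batch keeps
            # growing past the configured ceiling.
    return dispatched, batch_bytes

# Healthy broker: batches dispatch at roughly batch.size.
sent, leftover = produce([1_000] * 100, broker_busy=False)
# Saturated broker: nothing dispatches; the open batch inflates well
# past the 16 KiB ceiling.
_, inflated = produce([1_000] * 100, broker_busy=True)
```

With a responsive broker the toy model ships five ~17 KB batches; with every slot occupied the single open batch absorbs all 100 KB of records.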
The pseudo-code decomposition¶
From Redpanda's explainer, the trigger logic branches on the in-flight-cap state:
if (client not at max-in-flight cap):
    # Normal regime: dispatch as soon as either threshold fires.
    if (current linger > linger.ms || next message would exceed batch.size):
        close_and_send_current_batch()
else:
    # Backpressure regime: can't dispatch; keep enqueueing
    # until the next message would overflow the current batch, then
    # queue up the closed batch for a future slot.
    if next message would exceed batch.size:
        close_and_enqueue_current_batch()
In the normal regime, batch.size is a ceiling. In the
backpressure regime, batch.size is the close-and-queue threshold
— after which the next batch opens and continues growing, also
potentially past the ceiling.
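The branch logic can be made runnable. This is a paraphrase of the explainer's pseudo-code, with hypothetical names (`slot_free`, `linger_elapsed_ms`) standing in for client internals:

```python
def on_new_record(record_bytes, batch_bytes, linger_elapsed_ms,
                  slot_free, batch_size=16_384, linger_ms=5):
    """Return the action taken for the next record.

    Paraphrase of the explainer's branch logic; names are illustrative.
    """
    would_overflow = batch_bytes + record_bytes > batch_size
    if slot_free:
        # Normal regime: dispatch as soon as either threshold fires.
        if linger_elapsed_ms > linger_ms or would_overflow:
            return "close_and_send"
        return "append"
    # Backpressure regime: can't dispatch; queue the closed batch
    # for a future slot and keep accepting records.
    if would_overflow:
        return "close_and_enqueue"
    return "append"

# Normal regime at the ceiling: the batch closes and ships.
on_new_record(1_000, 16_000, 0, slot_free=True)    # "close_and_send"
# Same state with every slot occupied: the batch closes but only queues.
on_new_record(1_000, 16_000, 0, slot_free=False)   # "close_and_enqueue"
```

The only difference between the two regimes is what "close" means: send immediately, or park the batch until a slot frees.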
Second-order effect: adding brokers can decrease batch size¶
Redpanda's explainer names the counterintuitive downstream effect:
"One consequence is that adding additional brokers to a loaded cluster can sometimes cause batch sizes to decrease since there is less backpressure."
The cycle:
- Loaded cluster → produce responses delayed → max-in-flight slots
stay occupied → producer queues past batch.size → batches are
large (and batch.size is effectively not the cap).
- Add brokers → cluster responds faster → slots free up → producer
dispatches at batch.size again → batches shrink to the
configured ceiling.
More cluster capacity leads to smaller batches, and therefore a higher request rate per unit of record rate. This is the opposite of the typical intuition that "adding capacity makes everything better"; here it shifts the bottleneck resource but leaves effective batching partially worse.
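The request-rate consequence is simple arithmetic. The numbers below are illustrative, not measurements:

```python
# Request rate per unit of record rate follows from effective batch size.
# All numbers are illustrative.
record_rate = 100_000          # records/s from the application
record_bytes = 1_000           # average record size (bytes)

def requests_per_sec(effective_batch_bytes):
    """Produce-request rate implied by a given effective batch size."""
    records_per_batch = effective_batch_bytes // record_bytes
    return record_rate / records_per_batch

loaded = requests_per_sec(64_000)         # backpressure-inflated batches
after_scaling = requests_per_sec(16_000)  # dispatch at ~batch.size again
```

Shrinking effective batches from 64 KB back to 16 KB quadruples the produce-request rate for the same record rate, which is why adding brokers can trade one load problem for another.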
Why this is not an observability problem¶
A common debugging trap: engineers see batches larger than
batch.size and conclude the setting isn't working. It is working;
under saturation, broker-side backpressure is in command, not the
producer-side threshold. The fix is at the broker (add capacity) or
at the pipeline (reduce producer rate), not at the producer config.
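One way to avoid the trap is to compare observed batch sizes against the config before touching producer settings. The Java producer exposes a batch-size-avg metric; the sketch below is a hedged illustration that assumes such metrics have already been collected into a plain dict:

```python
def diagnose_batch_inflation(metrics, configured_batch_size):
    """Classify observed batch sizes against the configured ceiling.

    `metrics` is an illustrative plain dict of client metrics (e.g.
    the Java producer's batch-size-avg), collected elsewhere.
    """
    avg = metrics["batch-size-avg"]
    if avg > configured_batch_size:
        # Not a broken setting: dispatch is stalled on in-flight slots,
        # so look at broker capacity or producer rate, not batch.size.
        return "backpressure-inflated"
    return "normal"

diagnose_batch_inflation({"batch-size-avg": 48_000}, 16_384)
```

An average batch well above the ceiling is the producer-side fingerprint of the backpressure regime, not a misconfiguration.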
Relationship to classic backpressure¶
Classic backpressure is a slow-down signal — a slow consumer makes the fast producer stop. This is the inverse: the producer doesn't slow down, it grows its batches. The observable effect is the same (throughput adjusts to cluster capacity), but the mechanism is different:
- Classic backpressure: producer is explicitly blocked waiting for capacity.
- Kafka producer backpressure: producer continues accepting records but inflates batches because dispatch is stalled.
The client's ability to keep accepting records is bounded by
buffer.memory: once total open-batch memory exceeds
buffer.memory, classic blocking backpressure kicks in (the
producer thread blocks in send(), and the call fails with a
timeout if memory does not free in time).
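The flip into classic blocking backpressure can be sketched with a condition variable standing in for the client's memory pool. Names and the timeout behaviour are simplified from the real client, where send() blocks up to max.block.ms before failing:

```python
import threading

class BufferPool:
    """Toy stand-in for the producer's buffer.memory accounting."""

    def __init__(self, buffer_memory):
        self._memory = buffer_memory
        self._cond = threading.Condition()

    def send(self, record_bytes, timeout=0.01):
        # Classic backpressure: block until memory frees, or give up
        # after the timeout (the real client fails after max.block.ms).
        with self._cond:
            ok = self._cond.wait_for(
                lambda: self._memory >= record_bytes, timeout=timeout)
            if not ok:
                return False   # buffer.memory exhausted: send timed out
            self._memory -= record_bytes
            return True

    def complete(self, record_bytes):
        # A broker ack frees the batch's memory and wakes blocked senders.
        with self._cond:
            self._memory += record_bytes
            self._cond.notify_all()

pool = BufferPool(buffer_memory=4)
first = pool.send(3)     # fits within buffer memory
second = pool.send(3)    # only 1 unit free: blocks, then times out
pool.complete(3)         # an ack returns the first batch's memory
third = pool.send(3)     # memory freed: succeeds again
```

Until the pool is exhausted the producer keeps accepting records (inflating batches); once it is exhausted, send() itself becomes the blocking point, which is the classic form of backpressure.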
Seen in¶
- sources/2024-11-19-redpanda-batch-tuning-in-redpanda-for-optimized-performance-part-1 — canonical wiki source.
- sources/2024-11-26-redpanda-batch-tuning-in-redpanda-to-optimize-performance-part-2 — operations-manual companion that canonicalises the scheduler queue length private metric (vectorized_scheduler_queue_length) as the broker-side observable that confirms the backpressure regime. Scheduler backlog correlating with request volume is the signal that the producer's max-in-flight slots are saturating and backpressure-inflated batches are forming.
Related¶
- systems/kafka, systems/redpanda — Kafka-API clients implement this.
- concepts/effective-batch-size — factor 7 in the seven-factor framework.
- concepts/batching-latency-tradeoff — the normal-vs-saturated regime framing.
- concepts/backpressure — the adjacent generic concept.
- patterns/batch-over-network-to-broker — the producer-side pattern.