

Offset-commit cost

Definition

Offset-commit cost names the often-overlooked broker-side tax that consumer groups pay every time they commit their progress. A consumer group's committed offset per partition is stored in the internal __consumer_offsets topic; each commit is therefore a produce write to that topic, handled by the group coordinator broker.

A consumer group with P assigned partitions that commits every C seconds adds commit-write pressure at a rate of P / C writes per second, regardless of its read rate R. Across many consumer groups with frequent commits, the aggregate load on __consumer_offsets and the group coordinators can rival application-topic traffic.

Redpanda's save-button analogy (Source: sources/2025-04-23-redpanda-need-for-speed-9-tips-to-supercharge-redpanda):

"When consuming a topic, committing your consumer group offsets is exactly like pressing the save button. You record where you've read to, and just like that save, each commit takes time and resources. If you commit too frequently, your consumer will be less efficient, but it can also start to impact the broker as your consume workload gradually transforms (somewhat unknowingly) into a consume AND produce workload, since each read is accompanied by a commit write."

The read-becomes-write transformation

The mechanical chain:

consumer.poll()
  → broker returns batch of records
  → application processes records
  → consumer.commit()
    → consumer sends commit RPC to group coordinator
    → coordinator produces offset record to __consumer_offsets
    → coordinator acks commit to consumer

Every "consume event" implicitly triggers a "produce event" — the consumer load is actually consume + produce, weighted by commit frequency. For a consumer committing every 100 ms, the broker sees 10 commit writes per second per partition assigned to that consumer.

The knobs

enable.auto.commit (default true)

If true, the consumer commits in the background every auto.commit.interval.ms. The default auto-commit interval is 5 seconds — usually a reasonable starting point. The post's advice:

"If using auto commit, set auto.commit.interval.ms to a reasonable value. Generally, one second or higher; the default is 5 seconds. Low milliseconds is right out!"

Low-ms auto-commit is a common anti-pattern; developers reach for it thinking they're minimising re-read on restart. They are — but the broker tax is severe.
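As a sketch, the advice translates into consumer settings like the following. The key names are standard Kafka consumer configuration properties; the broker address, group id, and the guard at the end are illustrative, not from the post:

```python
# Sketch: consumer settings following the post's auto-commit advice.
# Key names are standard Kafka consumer configuration properties; the
# broker address and group id are placeholders.
consumer_config = {
    "bootstrap.servers": "localhost:9092",
    "group.id": "orders-service",          # one group per application
    "enable.auto.commit": True,
    "auto.commit.interval.ms": 5000,       # default; "one second or higher"
}

# Guard against the low-milliseconds anti-pattern: each commit is a
# produce write to __consumer_offsets, so low-ms intervals multiply them.
assert consumer_config["auto.commit.interval.ms"] >= 1000, \
    "low-ms auto-commit turns every consume into a produce"
```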

Manual commit

If enable.auto.commit=false, the consumer calls consumer.commitSync() or consumer.commitAsync() explicitly. The discipline:

"If manually committing in your application code, try to align your implied commit frequency to at least one second."

One consumer group per application

Shared consumer groups across services create group-coordinator contention and rebalance storms. Verbatim:

"Make sure each application or micro-service uses its own consumer group, otherwise your applications can inadvertently increase the load on a single group coordinator and make rebalances more costly and impactful."

Commit frequency = RPO dial

The consumer-side connection to RPO (recovery point objective):

On consumer restart, a consumer re-reads from its last committed offset. If the consumer was committing every C seconds and crashes, up to C seconds of records must be re-read. Commit frequency is the RPO for the consume side:

Commit interval    Worst-case re-read on crash
100 ms             100 ms
1 s                1 s
5 s (default)      5 s
60 s               60 s

Redpanda's verbatim framing:

"In a Disaster Recovery (DR) context, be aware of your Recovery Point Objectives (RPOs) and use those to help define your minimum commit frequency."

The corollary: if the application tolerates 10 s of re-read on crash (idempotent processing, deduplicated downstream), auto.commit.interval.ms=10000 is safe, and halves the commit rate relative to the default 5 s.
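The RPO-to-interval mapping can be written down directly. A sketch; the helper name is invented:

```python
def commit_interval_ms_for_rpo(rpo_seconds: float) -> int:
    """Invented helper: the commit interval is bounded above by the RPO.

    If the application tolerates re-reading `rpo_seconds` of records
    after a crash, there is no need to commit more often than that.
    """
    return int(rpo_seconds * 1000)

print(commit_interval_ms_for_rpo(10))  # tolerates 10 s of re-read -> 10000 ms
```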

Why "few seconds of re-read" is usually fine

The post's load-bearing argument against over-committing:

"Many folks try to commit excessively often to minimize re-reads during an application restart. While that initially sounds plausible, re-reading some amount of data occasionally is expected for most streaming applications, so if your application already has to cope with re-reading a few milliseconds of messages, it can probably cope with a few seconds worth."

Most streaming applications already tolerate some re-read (at-least-once delivery semantics, producer retries after broker timeouts, consumer rebalances). Extending that re-read window from a few ms to a few s is usually free — and buys an order-of-magnitude reduction in commit rate.

Quantifying the broker impact

For a consumer group with P assigned partitions committing every C seconds, the commit rate is P/C writes per second per group.

A hundred consumer groups, each with 32 partitions, committing every 100 ms:

- Commit rate: 100 × 32 / 0.1 = 32,000 writes/sec to __consumer_offsets.

Same workload committing every 5 s:

- Commit rate: 100 × 32 / 5 = 640 writes/sec.

50× reduction in commit load for a 50× longer potential re-read window. For most workloads, that's a trivially worthwhile trade.
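The arithmetic above as a runnable check. Numbers are taken from the worked example; the function name is ours:

```python
def offsets_topic_write_rate(groups: int, partitions_per_group: int,
                             commit_interval_ms: int) -> float:
    """Aggregate commit writes/sec landing on __consumer_offsets
    under the per-partition commit model described above."""
    return groups * partitions_per_group * 1000 / commit_interval_ms

fast = offsets_topic_write_rate(100, 32, 100)    # committing every 100 ms
slow = offsets_topic_write_rate(100, 32, 5000)   # committing every 5 s
print(fast, slow, fast / slow)  # 32000.0 640.0 50.0
```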
