CONCEPT Cited by 1 source

Transaction control batch

Definition

A transaction control batch is a special record appended to a Kafka partition's log by the transaction coordinator to mark the outcome of a transactional write. Two types exist:

  • COMMIT — signals that the preceding transactional records from this producer should be delivered to read_committed consumers.
  • ABORT — signals that the preceding transactional records should be hidden from read_committed consumers.

In a transactional write, the producer first writes data records (possibly across several partitions), then the coordinator appends a COMMIT or ABORT control batch to each involved partition.
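The write order described above can be sketched with a toy in-memory model. This is an illustration only, not Kafka's implementation: the `run_transaction` helper, the tuple layout of a record, and the dictionary of per-partition lists are all hypothetical names chosen for this example.

```python
from collections import defaultdict

def run_transaction(logs, producer_id, writes, outcome):
    """Append data records, then one control marker per touched partition.

    logs    : dict mapping partition number -> list of records (hypothetical model)
    writes  : list of (partition, value) pairs produced inside the transaction
    outcome : "COMMIT" or "ABORT"
    """
    touched = []
    # Phase 1: the producer writes its data records.
    for partition, value in writes:
        logs[partition].append(("data", producer_id, value))
        if partition not in touched:
            touched.append(partition)
    # Phase 2: a control marker lands on every partition the transaction touched.
    for partition in touched:
        logs[partition].append(("control", producer_id, outcome))
    return touched

logs = defaultdict(list)
touched = run_transaction(
    logs, producer_id=7, writes=[(0, "a"), (1, "b"), (0, "c")], outcome="COMMIT"
)
```

After the call, each of the two touched partitions ends with its own COMMIT marker, even though the transaction wrote two records to partition 0 and one to partition 1.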

Role in consumer isolation

Consumers running with isolation.level=read_committed use control batches to decide whether to deliver a transaction's records:

  • Records between a transaction's first data batch and its COMMIT marker are delivered.
  • Records between a transaction's first data batch and its ABORT marker are suppressed.
  • Records whose transaction has no visible control batch yet remain invisible: a read_committed consumer cannot read past the Last Stable Offset (LSO), which is the offset of the first record of the earliest still-open transaction.
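The three rules above can be sketched as a small filter over a single partition's log. This is a simplified model under stated assumptions: every record is transactional, each producer has at most one open transaction, offsets are list indices, and `read_committed` is a hypothetical helper, not a Kafka client API.

```python
def read_committed(log):
    """Return (delivered_payloads, lso) for a toy single-partition log.

    Each entry is (kind, producer_id, payload) with kind "data" or "control";
    a control entry's payload is "COMMIT" or "ABORT".
    """
    open_txns = {}     # producer_id -> offsets of records in its open transaction
    committed = set()  # offsets whose transaction has committed
    for offset, (kind, pid, payload) in enumerate(log):
        if kind == "data":
            open_txns.setdefault(pid, []).append(offset)
        elif payload == "COMMIT":
            committed.update(open_txns.pop(pid, []))
        else:  # ABORT: the records are dropped from view
            open_txns.pop(pid, None)
    # LSO: first offset of the earliest open transaction, else the log end.
    lso = min((offsets[0] for offsets in open_txns.values()), default=len(log))
    delivered = [log[o][2] for o in sorted(committed) if o < lso]
    return delivered, lso

log = [
    ("data", 1, "a"),
    ("data", 2, "x"),
    ("control", 1, "COMMIT"),
    ("data", 3, "m"),          # producer 3's transaction stays open
    ("control", 2, "ABORT"),
    ("data", 1, "b"),
    ("control", 1, "COMMIT"),
]
delivered, lso = read_committed(log)                  # (["a"], 3)
later = read_committed(log + [("control", 3, "COMMIT")])  # (["a", "m", "b"], 8)
```

In the first call, record "b" is already committed but sits beyond the LSO, so it is held back until producer 3's transaction resolves.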

Compaction lifecycle

Control batches sit in the log like ordinary records. In compacted topics, they are cleaned up in two time-based stages:

  1. After delete.retention.ms, the marker batch is replaced by an empty batch that still carries the producer ID and COMMIT/ABORT flag in its header.
  2. After producer.id.expiration.ms (timed from last producer activity), the empty batch may also be discarded.
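The two stages can be read as a small decision function. The sketch below is an assumption-laden illustration: `marker_state` and its state names are hypothetical, the starting point of the `delete.retention.ms` clock is simplified to a single "marker age" value, and the millisecond values in the examples are arbitrary rather than broker defaults.

```python
def marker_state(marker_age_ms, producer_idle_ms,
                 delete_retention_ms, producer_id_expiration_ms):
    """Classify a control batch in a compacted topic (toy model).

    marker_age_ms    : time elapsed on the delete.retention.ms clock (assumed)
    producer_idle_ms : time since the producer's last activity
    """
    if marker_age_ms < delete_retention_ms:
        return "full"         # stage 0: marker batch still intact
    if producer_idle_ms < producer_id_expiration_ms:
        return "empty_batch"  # stage 1: header-only remnant is kept
    return "removable"        # stage 2: the remnant may be discarded too

state_fresh = marker_state(1_000, 1_000, 5_000, 10_000)
state_remnant = marker_state(6_000, 2_000, 5_000, 10_000)
state_gone = marker_state(6_000, 12_000, 5_000, 10_000)
```

The middle case is the one that matters for the failure modes below: the marker's payload is gone, but a header carrying the producer ID and the COMMIT/ABORT flag remains.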

"Tombstones and COMMIT/ABORT control batches are the only signals that their associated records were deleted, committed, or aborted, respectively. Once a tombstone or a control batch is compacted away, this information is gone." (Source: sources/2026-06-25-redpanda-kafkas-log-compaction-corrupts-data)

Failure modes when lost

If a control batch is compacted away before a lagging replica sees it, three failure modes arise (see concepts/compaction-replication-race):

  • Lost ABORT → aborted data served as committed (the next COMMIT from the same producer is applied retroactively)
  • Lost COMMIT → committed data reclassified as aborted (the next ABORT is applied retroactively)
  • Lost COMMIT + empty-batch remnant → partition frozen at stale LSO for read_committed consumers
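The first two failure modes can be reproduced in a toy replay model. This is a sketch, not Kafka's replica logic: `classify` is a hypothetical helper that assumes a marker applies to every pending record from the same producer, which is the retroactive behavior the list describes. The third mode (frozen LSO) is not modeled here.

```python
def classify(log):
    """Replay a toy log and return (delivered, suppressed) payloads."""
    pending = {}  # producer_id -> payloads not yet covered by a marker
    delivered, suppressed = [], []
    for kind, pid, payload in log:
        if kind == "data":
            pending.setdefault(pid, []).append(payload)
        elif payload == "COMMIT":
            delivered += pending.pop(pid, [])
        else:  # ABORT
            suppressed += pending.pop(pid, [])
    return delivered, suppressed

def without(log, marker):
    """View of the log seen by a replica that fetched after `marker` was removed."""
    return [entry for entry in log if entry != marker]

abort_then_commit = [
    ("data", 7, "t1-aborted"), ("control", 7, "ABORT"),
    ("data", 7, "t2-committed"), ("control", 7, "COMMIT"),
]
commit_then_abort = [
    ("data", 7, "t1-committed"), ("control", 7, "COMMIT"),
    ("data", 7, "t2-aborted"), ("control", 7, "ABORT"),
]

correct = classify(abort_then_commit)
lost_abort = classify(without(abort_then_commit, ("control", 7, "ABORT")))
lost_commit = classify(without(commit_then_abort, ("control", 7, "COMMIT")))
```

With the ABORT marker missing, the later COMMIT sweeps up the aborted record and both payloads are delivered; with the COMMIT marker missing, the later ABORT suppresses the record that had actually committed.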

Seen in
