
CONCEPT

Placeholder batch (metadata in Raft)

Definition

A placeholder batch is a small, metadata-only record replicated through a streaming broker's Raft log whose payload is a pointer (object-storage URL, file offset, byte range) to the actual record bytes stored elsewhere — typically in object storage. The placeholder is what crosses the consensus boundary; the data itself does not.

The pattern lets a streaming broker decouple the data plane (bytes in S3/GCS/ADLS) from the metadata plane (pointers in Raft) while preserving the broker's existing transactional and idempotency semantics, because every producer write still flows through the same Raft-replicated produce path — just carrying a pointer instead of bytes.
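A minimal sketch of what a placeholder batch might carry. The field names are illustrative, not Redpanda's actual wire format; the point is that everything here is small, fixed-size metadata, while the record bytes live behind the pointer:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PlaceholderBatch:
    """Metadata-only record replicated through Raft; points at bytes stored elsewhere."""
    base_offset: int   # first partition offset covered by this batch
    record_count: int  # number of records the pointer covers
    object_url: str    # location of the L0 file in object storage (illustrative)
    byte_offset: int   # start of this partition's records within the L0 file
    byte_length: int   # length of the byte range to fetch

# Only this small struct crosses the consensus boundary; the payload does not.
batch = PlaceholderBatch(
    base_offset=1000, record_count=3,
    object_url="s3://bucket/l0/000042.bin",
    byte_offset=4096, byte_length=512,
)
```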

Canonical wiki instance

Introduced on the wiki from the 2026-03-30 Redpanda Cloud Topics architecture deep-dive:

"Once the L0 file is safely durable in the cloud, we replicate a placeholder batch containing the location of the data to the corresponding Raft log for each batch involved in the upload."

"Because we still use the Raft log for this metadata, Cloud Topics inherit the same transaction and idempotency logic as our standard topics. The data payload lives in the cloud, but the guarantees live in Redpanda."

The placeholder batch is what makes Redpanda Cloud Topics a drop-in Kafka topic class rather than a separate API: producers see identical produce semantics, consumers see identical fetch semantics, transactional coordinators see identical offset metadata — the only difference is that the bytes for produced records live in object storage instead of NVMe.

Why this works

Kafka's transactional and idempotency protocols (acks/durability, idempotent producers, transactional producers, exactly-once semantics) are built around the ordering and atomicity of records written to a partition. What they fundamentally require is:

  1. A per-partition monotonic offset assignment.
  2. A way to abort/commit a set of records atomically.
  3. A way to dedupe records by (producerId, epoch, sequence).

All three are properties of the record metadata (offsets, sequence numbers, transactional control records) — none require the record payload bytes to be on the broker's disk. As long as the metadata flows through the same Raft log in the same order, the broker can enforce the guarantees.

The placeholder batch embodies this observation: replace the payload with a pointer, keep everything else identical.
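The three requirements can be sketched in a few lines. This is an illustrative model of idempotent-producer dedupe, not broker code: notice that every decision depends only on `(producer_id, epoch, sequence)` metadata, so the payload can be a pointer just as easily as inline bytes:

```python
class PartitionState:
    """Toy per-partition state: monotonic offsets plus sequence-based dedupe."""

    def __init__(self):
        self.last_seq = {}    # (producer_id, epoch) -> last accepted sequence
        self.next_offset = 0  # per-partition monotonic offset counter

    def try_append(self, producer_id, epoch, sequence, payload):
        """payload may be record bytes OR a placeholder pointer; it is never inspected."""
        key = (producer_id, epoch)
        last = self.last_seq.get(key, -1)
        if sequence <= last:
            return None  # duplicate: drop without assigning a new offset
        if sequence != last + 1:
            raise ValueError("out-of-sequence record")
        self.last_seq[key] = sequence
        offset = self.next_offset
        self.next_offset += 1
        return offset

p = PartitionState()
assert p.try_append(7, 0, 0, b"inline bytes") == 0
assert p.try_append(7, 0, 1, "s3://bucket/l0/1.bin#0:512") == 1  # pointer payload
assert p.try_append(7, 0, 1, "producer retry") is None           # deduped
```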

Contrast with payload-in-Raft shapes

| Shape | Record payload replication | Record metadata replication | Cross-AZ bandwidth cost |
| --- | --- | --- | --- |
| Standard Kafka topic | ISR | ISR | Yes: RF−1 payload copies |
| Redpanda standard topic | Raft | Raft | Yes: RF−1 payload copies |
| Redpanda tiered-storage topic | Raft (hot tier) | Raft | Yes: hot tier only |
| Redpanda Cloud Topic | Object storage PUT | Placeholder batch via Raft | No (amortised into object-store PUT) |
| WarpStream-style stateless topic | Object storage PUT | External metadata store | No |

The placeholder-batch shape is distinct from fully stateless (WarpStream-style) topics: metadata still lives in-broker and is still replicated by the cluster's consensus layer, so the broker retains authoritative knowledge of what records exist — only the payload is externalised. This is what preserves data-plane atomicity from Redpanda's perspective: the Raft log is still the single source of truth for a partition's contents.

Relationship to "log as truth, data as cache"

The placeholder-batch mechanism is a concrete instance of the log-as-truth pattern where the Raft log of pointers is truth and the object-storage payload is addressable cache. If the object-storage blob disappeared but its placeholder survived, the broker would know a record existed at that offset but could not serve its bytes — operationally equivalent to a cache miss with unrecoverable data. If the placeholder was never committed, the record "doesn't exist" even if the bytes are in object storage — the log, not the storage, defines membership.
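Both failure asymmetries fall out of a simple fetch rule: consult the log first, then the store. A toy model (the in-memory `raft_log` and `object_store` dictionaries are stand-ins for the real components):

```python
# "Log as truth, data as cache": membership is defined by the committed log
# of pointers; object storage is only an addressable cache for the bytes.
raft_log = {100: "s3://bucket/l0/7.bin#0:512"}  # offset -> committed pointer
object_store = {}                               # the blob was lost

def fetch(offset):
    ptr = raft_log.get(offset)
    if ptr is None:
        return "record does not exist"                # never committed to the log
    if ptr not in object_store:
        return "record exists, bytes unrecoverable"   # cache miss, data loss
    return object_store[ptr]
```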

Operational shape

From the Cloud Topics architecture post:

  1. Producer sends records; broker batches them in memory across partitions / topics.
  2. Broker flushes the batch to object storage as an L0 file (see concepts/l0-l1-file-compaction-for-object-store-streaming).
  3. After the L0 file is durable, broker replicates a placeholder batch per involved partition through its Raft log, containing the object-storage location.
  4. Once the placeholder batch is committed by Raft, broker acks the producer.

The acknowledgment sequence means durability is established by the object-storage PUT completing and the placeholder Raft commit succeeding — both are required.
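The four steps above can be condensed into a sketch of the produce path. This is a toy model under loud assumptions (in-memory dictionaries stand in for object storage and the Raft log; the key scheme is invented), but it preserves the ordering that matters: PUT first, commit the pointer second, ack last:

```python
# Illustrative produce path: durability requires BOTH the object-store PUT
# and the Raft commit of the placeholder, in that order.
def produce(records, object_store, raft_log, partition):
    blob = b"".join(records)                # steps 1-2: batch and build an L0 file
    url = f"mem://l0/{len(object_store)}"   # illustrative object-store key
    object_store[url] = blob                # step 2: PUT must complete first
    placeholder = {"partition": partition, "url": url, "count": len(records)}
    raft_log.append(placeholder)            # step 3: replicate pointer via Raft
    return "ack"                            # step 4: ack only after commit

store, log = {}, []
assert produce([b"a", b"b"], store, log, partition=0) == "ack"
assert log[0]["url"] in store  # the committed pointer resolves to durable bytes
```

If the PUT fails, nothing enters the log and the producer simply retries; if the Raft commit fails, the blob is an unreferenced orphan in object storage, but no record was acknowledged.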
