CONCEPT

Pass-through write to object store

Definition

Pass-through write to object store is the storage-path shape in which a streaming broker receives a message and writes its payload directly to object storage (S3 / GCS / ADLS), without first persisting it to a local disk, while retaining metadata and consensus state on local NVMe for correctness and performance. The broker acts as a protocol-terminating proxy that routes bulk bytes to cheap durable storage and keeps only the small but latency-sensitive coordination state locally.

Canonicalised as the core architectural primitive behind Redpanda Cloud Topics in the Redpanda 26.1 launch post (Source: sources/2026-03-31-redpanda-261-delivers-the-industrys-first-adaptable-streaming-engine):

"Cloud Topics use a pass-through write model that saves the bulk of your messages directly to object storage. We stream the heavy message payloads directly to S3 or GCS, but we keep the brains of the operation—the metadata and Raft consensus—on high-performance local NVMe."
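The quoted split can be sketched as a toy write path. Everything below is a hypothetical illustration: `ObjectStore`, `RaftLog`, and `Broker` are stand-in names, not Redpanda APIs.

```python
# Hypothetical sketch of a pass-through write path. Class and method
# names are illustrative stand-ins, not vendor APIs.

class ObjectStore:
    """Stand-in for S3/GCS: durable, multi-AZ, multi-ms PUT latency."""
    def __init__(self):
        self.objects = {}

    def put(self, key: str, payload: bytes) -> None:
        self.objects[key] = payload  # durable once acknowledged


class RaftLog:
    """Stand-in for the metadata/consensus log on local NVMe."""
    def __init__(self):
        self.entries = []

    def append(self, entry: dict) -> int:
        self.entries.append(entry)  # fsync'd locally, sub-ms
        return len(self.entries) - 1  # log offset


class Broker:
    """Protocol-terminating proxy: bulk bytes go to the object store;
    only a small coordination record goes to the local Raft log."""
    def __init__(self, store: ObjectStore, log: RaftLog):
        self.store = store
        self.log = log

    def produce(self, topic: str, partition: int, payload: bytes) -> int:
        # 1. Pass the payload straight through to object storage;
        #    the bulk bytes never touch local disk.
        key = f"{topic}/{partition}/{len(self.log.entries)}"
        self.store.put(key, payload)
        # 2. Persist only the pointer + sizing metadata via Raft.
        return self.log.append({"key": key, "size": len(payload)})


broker = Broker(ObjectStore(), RaftLog())
offset = broker.produce("clicks", 0, b"x" * 1024)
```

Note that the metadata entry stores only a key and a size: broker disk usage scales with record count, not payload volume.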

The layer split

Pass-through write divides the streaming-broker storage path into two layers with different cost/latency profiles:

Layer                         Substrate       Size           Latency profile
Message payload (bulk bytes)  Object storage  99%+ of bytes  Tolerant: multi-ms PUT/GET is fine
Metadata + Raft consensus     Local NVMe      <1% of bytes   Intolerant: must be sub-ms

The metadata layer covers partition leadership state, Raft consensus log entries, topic configurations, ACLs, consumer group offsets, and transaction markers. These are latency-critical for leader elections, transaction commits, and consumer-group rebalances. The payload layer holds the actual producer-sent bytes, which dominate storage volume but tolerate object-storage-grade latency for writes and reads on latency-tolerant topics.

Distinguishing from tiered storage

Pass-through write is structurally distinct from conventional tiered storage:

Axis                              Tiered storage                Pass-through write
Hot data location                 Local NVMe                    Object storage
Cold data location                Object storage                Object storage
Cross-AZ replication on hot tier  Yes (RF-1 cross-AZ copies)    No (object store inherits durability)
Broker disk usage                 Grows with retention          Near-constant (metadata only)
Primary read-path                 NVMe → object store fallback  Object store always

Tiered storage keeps hot data on NVMe and offloads old data asynchronously to object storage. Pass-through write sends payload bytes to object storage first, using NVMe only for the small metadata and consensus state. This eliminates cross-AZ replication cost (see concepts/cross-az-replication-bandwidth-cost) but introduces object-storage write latency on the hot write-path.
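The ordering difference can be sketched side by side. This is an illustrative simplification under assumed names; the list/dict arguments are toy stand-ins for NVMe, an upload queue, and an object store.

```python
# Contrast of write ordering on the hot path. Function names and
# signatures are assumptions for illustration, not vendor APIs.

def tiered_storage_write(payload: bytes, nvme: list, upload_queue: list) -> str:
    # Hot path: persist payload to local NVMe (cross-AZ replication to
    # RF-1 peers not shown); the object-store upload is deferred.
    nvme.append(payload)
    upload_queue.append(payload)  # drained later by a background uploader
    return "acked from NVMe"

def pass_through_write(payload: bytes, object_store: dict, metadata_log: list) -> str:
    # Hot path: PUT payload to object storage first; local NVMe gets
    # only a small pointer entry, so broker disk usage stays near-constant.
    key = f"obj-{len(object_store)}"
    object_store[key] = payload
    metadata_log.append({"key": key, "size": len(payload)})
    return "acked from object store"
```

The ack source is the crux: tiered storage can acknowledge from NVMe before the upload completes, while pass-through write cannot acknowledge until the object-store PUT succeeds, which is why it trades hot-path latency for the cost savings described below.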

See concepts/tiered-storage-as-primary-fallback for the primary-fallback variant where NVMe is authoritative and object storage is asynchronous.

Distinguishing from diskless

Pass-through write is also distinct from whole-cluster diskless architectures (WarpStream):

Axis                   Diskless                                        Pass-through (disk-lite)
Message payload        Object storage                                  Object storage
Metadata               External metadata store / coordination service  Local NVMe via Raft
Raft consensus         N/A (different model)                           Local NVMe
Transaction support    Often limited / disabled                        Full (Raft-consistent)
External dependencies  Metadata store, coordination service            None (beyond object store)

Redpanda's explicit framing ("diskless isn't riskless") points to three structural advantages of pass-through write over fully diskless designs:

  1. No broken transactions — Raft consensus on local NVMe supports Kafka-style transactions.
  2. No metadata lag — metadata is in-broker, not in a remote service.
  3. No external control plane dependencies — the broker is self-contained.

Cost structure

The economic rationale is cross-AZ bandwidth elimination:

  • Traditional streaming: every byte replicated to RF-1 other AZs → cross-AZ transfer billed per byte → dominant line item.
  • Pass-through write: every byte goes directly to object storage, whose multi-AZ durability is amortised at storage-service pricing → no per-byte cross-AZ bill.

Vendor claim (Redpanda 26.1): "over 90% lower networking costs" for latency-tolerant workloads.
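A back-of-envelope model makes the shape of the claim concrete. All prices, the workload size, and the object size below are hypothetical placeholders chosen for illustration, not quoted vendor or cloud rates.

```python
# Toy cost model for cross-AZ elimination. Every figure here is an
# assumed placeholder, not a real price sheet.

ingest_gb_per_month = 100_000   # assumed workload
rf = 3                          # assumed replication factor
cross_az_per_gb = 0.02          # assumed $/GB inter-AZ transfer
object_put_per_1k = 0.005       # assumed $/1,000 PUT requests
avg_object_mb = 16              # assumed bytes per uploaded object

# Traditional streaming: every ingested byte crosses AZs RF-1 times.
traditional_network = ingest_gb_per_month * (rf - 1) * cross_az_per_gb

# Pass-through: no per-byte cross-AZ bill; pay per-request PUT costs
# instead (the object store's multi-AZ durability is already priced in).
puts = ingest_gb_per_month * 1024 / avg_object_mb
pass_through_network = puts / 1000 * object_put_per_1k

savings = 1 - pass_through_network / traditional_network
```

Under these placeholder numbers the networking line item drops by well over 90%, consistent with the direction of the vendor claim; real savings depend on replication factor, batch sizes, and actual cloud pricing.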

Why the metadata layer still needs NVMe

The architectural question pass-through write answers is: can we avoid cross-AZ replication on the hot path without sacrificing streaming correctness?

Moving metadata to object storage would:

  • Incur object-store latency on every leader election.
  • Incur object-store latency on every transaction commit.
  • Break latency-sensitive workloads that ride on fast consensus.

Keeping metadata on local NVMe preserves:

  • Single-digit-ms leader elections.
  • Sub-ms transaction-commit latency.
  • Strong consistency via Raft quorum on the metadata log.

The result is a hybrid storage shape: cheap for bulk bytes, fast for coordination.
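The latency argument above can be made concrete with a rough budget. The figures and the two-round-trip consensus model below are assumed orders of magnitude for illustration, not measurements of any system.

```python
# Rough latency budget for metadata commits. All figures are assumed
# orders of magnitude, and the consensus model is deliberately simplified.

NVME_FSYNC_MS = 0.1    # assumed local NVMe append + fsync
OBJECT_PUT_MS = 50.0   # assumed object-store PUT round trip
RAFT_ROUND_TRIPS = 2   # simplified: leader append + quorum ack

def commit_latency(storage_ms: float, round_trips: int = RAFT_ROUND_TRIPS) -> float:
    # Each consensus round trip pays the metadata-storage write latency.
    return storage_ms * round_trips

nvme_commit = commit_latency(NVME_FSYNC_MS)    # well under 1 ms
object_commit = commit_latency(OBJECT_PUT_MS)  # ~100 ms per commit
```

Because every leader election, transaction commit, and rebalance pays this per-write cost repeatedly, a two-orders-of-magnitude gap on the metadata path is fatal for latency-sensitive coordination even though it is acceptable for bulk payload bytes.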
