CONCEPT
Pass-through write to object store¶
Definition¶
Pass-through write to object store is the storage-path shape in which a streaming broker receives a message and writes the message payload directly to object storage (S3 / GCS / ADLS), without first persisting it to a local disk, while retaining metadata and consensus state on local NVMe for correctness and performance. The broker acts as a protocol-terminating proxy: it routes bulk bytes to cheap durable storage and keeps only the small but latency-sensitive coordination state locally.
Canonicalised as the core architectural primitive behind Redpanda Cloud Topics in the Redpanda 26.1 launch post (Source: sources/2026-03-31-redpanda-261-delivers-the-industrys-first-adaptable-streaming-engine):
"Cloud Topics use a pass-through write model that saves the bulk of your messages directly to object storage. We stream the heavy message payloads directly to S3 or GCS, but we keep the brains of the operation—the metadata and Raft consensus—on high-performance local NVMe."
The layer split¶
Pass-through write divides the streaming-broker storage path into two layers with different cost/latency profiles:
| Layer | Substrate | Size | Latency profile |
|---|---|---|---|
| Message payload (bulk bytes) | Object storage | 99%+ of bytes | Tolerant: multi-ms PUT/GET is fine |
| Metadata + Raft consensus | Local NVMe | <1% of bytes | Intolerant: must be sub-ms |
The metadata layer covers partition leadership state, Raft consensus log entries, topic configurations, ACLs, consumer group offsets, and transaction markers. These are latency-critical for leader elections, transaction commits, and consumer-group rebalances. The payload layer holds the actual producer-sent bytes, which dominate storage volume but tolerate object-storage-grade latency for writes and reads on latency-tolerant topics.
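The split can be made concrete with a minimal Python sketch. All names here are hypothetical: a dict stands in for S3/GCS, and a list stands in for the local NVMe-backed, Raft-replicated metadata log.

```python
import zlib


class PassThroughBroker:
    """Toy sketch of the pass-through write split (names hypothetical)."""

    def __init__(self, object_store: dict, metadata_log: list):
        self.object_store = object_store  # stands in for S3/GCS; multi-ms PUTs are fine
        self.metadata_log = metadata_log  # stands in for the local NVMe Raft log; must stay sub-ms
        self.next_offset = 0

    def produce(self, topic: str, partition: int, payload: bytes) -> int:
        offset = self.next_offset
        # Bulk bytes: go straight to object storage, never to local disk.
        key = f"{topic}/{partition}/{offset:020d}"
        self.object_store[key] = payload
        # Coordination state: a tiny record appended to the local metadata log.
        self.metadata_log.append({
            "offset": offset,
            "key": key,
            "size": len(payload),
            "crc": zlib.crc32(payload),
        })
        self.next_offset += 1
        return offset


store, log = {}, []
broker = PassThroughBroker(store, log)
broker.produce("clicks", 0, b"x" * 1024)
```

After the call, the 1 KiB payload lives entirely in the object store; the local log holds only a few tens of bytes of coordination state, which is what keeps broker disk usage near-constant.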
Distinguishing from tiered storage¶
Pass-through write is structurally distinct from conventional tiered storage:
| Axis | Tiered storage | Pass-through write |
|---|---|---|
| Hot data location | Local NVMe | Object storage |
| Cold data location | Object storage | Object storage |
| Cross-AZ replication on hot tier | Yes (RF-1 cross-AZ copies) | No (object store inherits durability) |
| Broker disk usage | Grows with retention | Near-constant (metadata only) |
| Primary read-path | NVMe → object store fallback | Object store always |
Tiered storage keeps hot data on NVMe and asynchronously offloads old data to object storage. Pass-through write sends payload bytes to object storage first, keeping NVMe only for the small metadata and consensus state. This eliminates cross-AZ replication cost (see concepts/cross-az-replication-bandwidth-cost) but introduces object-storage write latency on the hot write-path.
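The read-path row of the table can be sketched minimally; names are hypothetical, and plain dicts stand in for the storage substrates:

```python
def tiered_read(key, nvme, object_store):
    # Tiered storage: local NVMe holds hot data and is tried first;
    # the object store is only a fallback for aged-out segments.
    if key in nvme:
        return nvme[key]
    return object_store[key]


def pass_through_read(key, object_store, cache=None):
    # Pass-through write: the object store is always authoritative;
    # any local cache is an optional optimisation, never a source of truth.
    if cache and key in cache:
        return cache[key]
    return object_store[key]
```

The structural difference is which substrate is authoritative: in the tiered shape NVMe is, so its contents must be replicated across AZs; in the pass-through shape the object store is, so they need not be.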
See concepts/tiered-storage-as-primary-fallback for the primary-fallback variant where NVMe is authoritative and object storage is asynchronous.
Distinguishing from diskless¶
Pass-through write is also distinct from whole-cluster diskless architectures (WarpStream):
| Axis | Diskless | Pass-through (disk-lite) |
|---|---|---|
| Message payload | Object storage | Object storage |
| Metadata | External metadata store / coordination service | Local NVMe via Raft |
| Raft consensus | N/A (different model) | Local NVMe |
| Transaction support | Often limited / disabled | Full (Raft-consistent) |
| External dependencies | Metadata store, coordination service | None (beyond object store) |
Redpanda's explicit framing ("diskless isn't riskless") points to three structural advantages of pass-through write over fully diskless designs:
- No broken transactions — Raft consensus on local NVMe supports Kafka-style transactions.
- No metadata lag — metadata is in-broker, not in a remote service.
- No external control plane dependencies — the broker is self-contained.
Cost structure¶
The economic rationale is cross-AZ bandwidth elimination:
- Traditional streaming: every byte replicated to RF-1 other AZs → cross-AZ transfer billed per byte → dominant line item.
- Pass-through write: every byte goes directly to object storage, whose multi-AZ durability is amortised at storage-service pricing → no per-byte cross-AZ bill.
Vendor claim (Redpanda 26.1): "over 90% lower networking costs" for latency-tolerant workloads.
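The mechanism behind that claim can be shown with back-of-envelope arithmetic; the workload and prices below are illustrative assumptions, not vendor figures:

```python
# Back-of-envelope comparison with illustrative figures (assumptions, not quotes):
# cross-AZ transfer priced at ~$0.02/GB (both directions combined), RF = 3.
ingest_gb_per_day = 1000      # assumed workload: ~1 TB/day of producer traffic
cross_az_price = 0.02         # $/GB, illustrative
rf = 3

# Traditional replication: every byte is copied to RF-1 brokers in other AZs.
traditional_daily = ingest_gb_per_day * (rf - 1) * cross_az_price   # $40.00/day

# Pass-through write: bytes go straight to object storage, whose multi-AZ
# durability is bundled into storage pricing, so the per-byte cross-AZ
# networking line item drops to zero.
pass_through_daily = 0.0

print(f"traditional ${traditional_daily:.2f}/day vs pass-through ${pass_through_daily:.2f}/day")
```

Under these assumptions the cross-AZ networking line item disappears entirely rather than merely shrinking, which is why vendor claims in the ">90% lower" range are plausible for latency-tolerant workloads (object-store request charges replace it, but those are per-request, not per-byte-per-AZ).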
Why the metadata layer still needs NVMe¶
The architectural question pass-through write answers is: can we avoid cross-AZ replication on the hot path without sacrificing streaming correctness?
Moving metadata to object storage would:
- Incur object-store latency on every leader election.
- Incur object-store latency on every transaction commit.
- Break latency-sensitive workloads that ride on fast consensus.
Keeping metadata on local NVMe preserves:
- Single-digit-ms leader elections.
- Sub-ms transaction-commit latency.
- Strong consistency via Raft quorum on the metadata log.
The result is a hybrid storage shape: cheap for bulk bytes, fast for coordination.
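The stakes can be sketched with an illustrative latency budget; the figures below are assumed orders of magnitude, not measurements:

```python
# Illustrative latency budget (assumed order-of-magnitude figures, not measurements):
OBJECT_STORE_PUT_MS = 20.0   # a typical S3/GCS PUT, order of magnitude
LOCAL_NVME_WRITE_MS = 0.1    # a local NVMe append + flush


def commit_latency_ms(metadata_writes: int, write_ms: float) -> float:
    # A transaction commit serialises a handful of small metadata writes
    # (markers, offsets); each one pays the substrate's write latency.
    return metadata_writes * write_ms


print(commit_latency_ms(3, OBJECT_STORE_PUT_MS))  # object-store metadata: ~60 ms
print(commit_latency_ms(3, LOCAL_NVME_WRITE_MS))  # NVMe metadata: ~0.3 ms
```

A two-orders-of-magnitude gap on every serialised coordination step is why the metadata layer stays on NVMe even when the payload layer moves to object storage.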
Seen in¶
- sources/2026-03-31-redpanda-261-delivers-the-industrys-first-adaptable-streaming-engine — canonical wiki source. Redpanda 26.1 GAs Cloud Topics with the "pass-through write model" name as its core architectural primitive.
Related¶
- systems/redpanda — the broker hosting pass-through-write topics.
- systems/redpanda-cloud-topics — the feature that implements pass-through write.
- systems/aws-s3, systems/google-cloud-storage — the backing object stores.
- systems/warpstream — the whole-cluster-diskless shape pass-through write differentiates against.
- concepts/cross-az-replication-bandwidth-cost — the cost axis pass-through write attacks.
- concepts/tiered-storage-as-primary-fallback — the NVMe-primary alternative model.
- concepts/compute-storage-separation — the broader architectural pattern pass-through write is a streaming-broker instantiation of.
- patterns/per-topic-storage-tier-within-one-cluster — the composition pattern that picks pass-through-write per topic.
- patterns/diskless-disk-lite-hybrid-streaming — the named pattern.
- patterns/tiered-storage-to-object-store — the broader pattern family.