Remote read replica topic
Definition
A remote read replica topic is a read-only topic on a separate cluster that mirrors a topic on an origin cluster by reading the origin's tiered-storage / archival-storage segments directly from object storage (S3, GCS, Azure Blob) — bypassing the origin cluster's brokers entirely. Consumers subscribe to the remote cluster's topic; the remote cluster fetches segments from the shared object store; the origin cluster's broker fleet experiences no read load from the remote consumers.
This is the read-fan-out-decoupled-from-origin primitive on tiered-storage-capable streaming brokers like Redpanda.
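In Redpanda terms the setup is roughly two `rpk` invocations, one per cluster. A sketch, not a verbatim runbook: the topic name `logs` and bucket name `origin-tiered-storage` are hypothetical, and the exact property names should be checked against the Redpanda docs for your version.

```bash
# On the origin cluster: create the topic with Tiered Storage enabled,
# so closed segments are uploaded to the shared object store.
rpk topic create logs \
  -c redpanda.remote.write=true \
  -c redpanda.remote.read=true

# On the remote cluster: create the read-only replica topic, pointed at
# the origin's bucket. Consumers subscribe to it like any other topic.
rpk topic create logs \
  -c redpanda.remote.readreplica=origin-tiered-storage
```

Note the origin cluster is never named in the remote cluster's config: the only shared coordinate is the bucket.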
Canonical Redpanda framing
"A Remote Read Replica topic is a read-only topic that mirrors a topic on a different cluster. It works with both Tiered Storage and archival storage."
"Remote Read Replicas allow you to create a separate remote cluster for consumers of a specific topic, populating its topics from remote storage. This can serve consumers without increasing the load on the origin cluster. These read-only topics access data directly from object storage instead of the topics' origin cluster, which means there's no impact on the performance of the original cluster. Topic data can be consumed within a region of your choice, regardless of where it was produced."
(Source: sources/2025-02-11-redpanda-high-availability-deployment-multi-region-stretch-clusters)
How it differs from follower fetching
concepts/follower-fetching optimises read-path locality by letting consumers read from a follower broker in the origin cluster rather than the leader. A remote read replica topic goes one step further: the consumer reads from a separate cluster entirely, backed by the shared object store.
| | Follower fetching | Remote read replica |
|---|---|---|
| Cluster | Same origin cluster | Separate remote cluster |
| Data source | Origin follower broker | Object storage (S3/GCS) |
| Origin broker load | Reduced (reads go to followers) | Zero |
| Staleness | Replica lag (ms) | Object-storage upload interval (seconds) |
| Scale-out ceiling | Bounded by replication factor | Unbounded — more read clusters, more read throughput |
| Cross-region application | Yes, but still loads origin cluster | Yes, and isolates origin from remote reads entirely |
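For contrast, the follower-fetching column above corresponds to purely intra-cluster configuration in Kafka (KIP-392): the broker advertises a rack-aware replica selector and each consumer declares its rack. A sketch using the standard Apache Kafka property names; the rack ID is hypothetical.

```properties
# Broker (server.properties): allow consumers to fetch from the
# nearest in-sync replica instead of always hitting the leader.
broker.rack=us-east-1a
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector

# Consumer: declare locality so reads route to a co-located follower.
client.rack=us-east-1a
```

Everything here names brokers inside one cluster, which is exactly the ceiling the table's "Scale-out" row describes.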
The architectural difference is read-load scale-up vs. scale-out. Follower fetching spreads reads across the origin's existing follower brokers, so read throughput is still capped by the origin cluster's own capacity. A remote read replica adds a separate read cluster in the read-heavy region, scaling read fan-out without adding origin brokers.
Architectural substrate: tiered storage
Remote read replica is built on tiered storage — the origin cluster already offloads historical log segments to object storage (S3/GCS) for cost and retention reasons. A remote read replica cluster reads those same segments from the object store directly, without needing to communicate with the origin's brokers. The segments become a de facto shared read substrate between origin and remote clusters.
Because object-storage uploads happen on a segment-close cadence (segments are written to local NVMe first, then uploaded when closed or compacted), the remote read replica lags the origin by roughly one segment interval — typically seconds. This makes remote read replica unsuitable for real-time consumers, but well suited to read-heavy archival / analytical workloads where the origin cluster shouldn't be loaded.
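The staleness ceiling is easy to reason about numerically: worst-case lag is the time to fill one segment, capped by the broker's maximum upload interval. A minimal sketch with hypothetical numbers (128 MiB segments, 16 MiB/s of produce traffic, a 300 s upload-interval cap); the actual knobs and their names are broker- and version-specific.

```python
def staleness_bound_s(segment_bytes: int,
                      produce_bytes_per_s: int,
                      max_upload_interval_s: float) -> float:
    """Rough worst-case remote-replica lag: time to fill one segment,
    capped by the upload interval (segments upload on close, or when
    the interval cap forces an upload of a partial segment)."""
    fill_time_s = segment_bytes / produce_bytes_per_s
    return min(fill_time_s, max_upload_interval_s)

# A partition producing 16 MiB/s fills a 128 MiB segment in 8 s,
# well under a 300 s cap -- lag measured in seconds, not minutes.
print(staleness_bound_s(128 * 2**20, 16 * 2**20, 300.0))  # → 8.0
```

On a near-idle partition the interval cap dominates instead, which is why the lag is bounded in seconds either way.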
Distinction from MirrorMaker2
MirrorMaker2 runs a pull process between two independent clusters, consuming from the source cluster's brokers and producing to the destination cluster's brokers. Every record is double-handled, and both clusters' broker fleets carry the replication load.
Remote read replica avoids the broker-to-broker copy entirely — the object-storage segments are the mirror, and the remote cluster only needs to read them. The origin cluster's broker fleet is untouched. This is a substantially cheaper fan-out mechanism when tiered storage is already deployed.
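The coupling shows up directly in MM2's configuration, which must name both clusters' broker endpoints. A sketch in standard MirrorMaker2 property syntax; the cluster aliases, addresses, and topic name are hypothetical.

```properties
# mm2.properties: both broker fleets participate in the copy.
clusters = origin, remote
origin.bootstrap.servers = origin-broker-1:9092
remote.bootstrap.servers = remote-broker-1:9092

# Replicate by consuming from origin brokers, producing to remote brokers.
origin->remote.enabled = true
origin->remote.topics = logs
```

Compare the remote-read-replica setup, where the remote cluster's config names only the object-storage bucket and never touches the origin's brokers.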
Seen in
- sources/2025-02-11-redpanda-high-availability-deployment-multi-region-stretch-clusters — canonical wiki definition; positioned as the fourth mitigation for stretch-cluster cross-region cost alongside leader pinning, acks=1, and follower fetching.
Related
- systems/redpanda, systems/kafka
- concepts/multi-region-stretch-cluster — the shape remote read replica decouples read load from.
- concepts/follower-fetching — the intra-cluster analogue.
- concepts/leader-follower-replication — the replication shape remote read replica sidesteps.
- concepts/mirrormaker2-async-replication — the broker-to-broker alternative.
- patterns/async-replication-for-cross-region — the pattern class remote read replica offers an object-storage-backed instantiation of.