PATTERN Cited by 1 source
Offset-preserving async cross-region replication
Pattern
Maintain a hot-standby clone of a streaming cluster in a second region by asynchronously replicating every record with the source's offsets preserved. At DR-failover time, consumers resume at the same offsets they held on the source — no offset-translation map, no consumer re-snapshot.
The pattern composes three properties:
- Asynchronous replication: two independent clusters, each acknowledging writes at local, in-region latency. No write pays a cross-region round trip.
- Offset preservation (concepts/offset-preserving-replication): the destination stores each record at its source-assigned offset rather than assigning an offset of its own.
- Broker-internal implementation (concepts/broker-internal-cross-cluster-replication): the replication mechanism lives inside the broker's log layer, not in a separate Kafka Connect connector, which cannot preserve offsets because it produces via the public API.
The combination is stronger than any of the three alone. Asynchronous replication by itself is the MirrorMaker 2 shape, which sacrifices offset preservation. Offset preservation is not feasible without a broker-internal implementation, because the public Kafka producer API does not accept externally supplied offsets. A broker-internal implementation by itself is a design property, not a DR shape.
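The difference between the two write paths can be illustrated with a toy in-memory log. This is a minimal sketch, not Redpanda's or Kafka's implementation: `ToyLog`, `produce`, `append_at`, and both replicate functions are hypothetical names invented for the illustration. The source log is assumed to start at offset 100 (for example, because older records have aged out), which is enough to make destination-assigned offsets diverge.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLog:
    """In-memory stand-in for one partition's log (hypothetical, not a broker API)."""
    records: dict[int, str] = field(default_factory=dict)
    next_offset: int = 0

    def produce(self, value: str) -> int:
        """Public-API style append: the log assigns the offset itself."""
        offset = self.next_offset
        self.records[offset] = value
        self.next_offset = offset + 1
        return offset

    def append_at(self, offset: int, value: str) -> None:
        """Log-layer style append: the caller supplies the source-assigned offset."""
        if offset < self.next_offset:
            raise ValueError(f"offset {offset} is already written")
        self.records[offset] = value
        self.next_offset = offset + 1


def replicate_via_public_api(source: ToyLog, dest: ToyLog) -> dict[int, int]:
    """Connector-style copy: returns the source-to-destination offset translation map."""
    translation = {}
    for offset in sorted(source.records):
        translation[offset] = dest.produce(source.records[offset])
    return translation


def replicate_in_log_layer(source: ToyLog, shadow: ToyLog) -> None:
    """Broker-internal style copy: every record keeps its source offset."""
    for offset in sorted(source.records):
        shadow.append_at(offset, source.records[offset])


source = ToyLog(next_offset=100)  # assumed: offsets 0..99 are no longer retained
for value in ("a", "b", "c"):
    source.produce(value)

mirror = ToyLog()
translation = replicate_via_public_api(source, mirror)

shadow = ToyLog()
replicate_in_log_layer(source, shadow)

print(translation)                       # {100: 0, 101: 1, 102: 2}
print(shadow.records == source.records)  # True
```

In the connector-style copy, a consumer that had committed offset 101 on the source needs the translation map to find its place on the mirror. In the log-layer copy, offset 101 means the same record on both sides.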
Canonical instance
Redpanda Shadowing (25.3, 2025-11-06) is the first and (at publication) only wiki instance.
From the 25.3 launch post (sources/2025-11-06-redpanda-253-delivers-near-instant-disaster-recovery-and-more):
"Shadowing combines an asynchronous replication mechanism with offset preservation, allowing for multi-region disaster recovery with simpler client failover procedures."
"Shadowing is built into the Redpanda broker itself and uses the standard Kafka API to link clusters. No MirrorMaker 2 or Redpanda Migrator connectors are used under the hood."
Position on the DR axis
Two other shapes occupy adjacent slots on Redpanda's DR axis:
| Shape | Replication mode | Offsets | RPO | RTO | Operator surface |
|---|---|---|---|---|---|
| Stretch cluster | Sync (Raft across regions) | Same cluster — trivially preserved | 0 | Seconds (Raft re-election) | Single cluster |
| Offset-preserving async cross-region (this pattern) | Async | Preserved | Seconds | Seconds (client timeout-bound) | Two clusters, one feature |
| MM2 async | Async | Translated | Seconds to minutes (lag-dependent) | Seconds to minutes + translation-map lookup | Two clusters + Kafka Connect |
This pattern fills a specific slot: "seconds RPO/RTO without the connector overhead". It is the DR shape for customers who have tight RPO/RTO goals but cannot afford a cross-region round trip on every write, and it is available now that the broker-internal mechanism exists.
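One consequence of the "Preserved" offsets column is that replication lag can be estimated by plain subtraction, since the same offset refers to the same record on both clusters. The sketch below assumes the per-partition log end offsets have already been collected from each cluster by some means (how is out of scope here); `records_at_risk` and `estimated_rpo_seconds` are hypothetical helpers, and the sample numbers are invented.

```python
def records_at_risk(source_end: dict[int, int], shadow_end: dict[int, int]) -> dict[int, int]:
    """Per-partition count of records written to the source but not yet on the shadow.

    Only meaningful when offsets are preserved, so that both end offsets
    share one numbering.
    """
    lag = {}
    for partition, source_offset in source_end.items():
        # Assumption: a partition missing on the shadow counts as unreplicated from offset 0.
        shadow_offset = shadow_end.get(partition, 0)
        lag[partition] = max(source_offset - shadow_offset, 0)
    return lag


def estimated_rpo_seconds(lag_records: int, records_per_second: float) -> float:
    """Rough RPO estimate: how many seconds of writes the lagging records represent."""
    if records_per_second <= 0:
        raise ValueError("records_per_second must be positive")
    return lag_records / records_per_second


source_end = {0: 1200, 1: 980}  # invented sample end offsets on the source
shadow_end = {0: 1188, 1: 980}  # invented sample end offsets on the shadow

lag = records_at_risk(source_end, shadow_end)
print(lag)                                                               # {0: 12, 1: 0}
print(estimated_rpo_seconds(sum(lag.values()), records_per_second=4.0))  # 3.0
```

With translated offsets the same calculation would first need the translation map, because the two end offsets would not be directly comparable.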
Client-failover procedure
With offset preservation, the failover mechanics simplify to:
- Repoint consumers at the shadow cluster's bootstrap endpoints.
- Consumers resume at their last committed offsets — which are valid on the shadow cluster because offsets are preserved.
- Repoint producers at the shadow cluster's bootstrap endpoints.
No offset-translation-map lookup, no consumer-group re-snapshot, no consumer-lag spike from mis-translated offsets. The recovery time is bounded by the client timeout — verbatim from the launch post: "RTOs … limited only by timeout settings for producers and consumers."
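On the client side, the procedure above amounts to changing one setting. The sketch below models client configuration as a plain dictionary using the standard Kafka client property names `bootstrap.servers` and `group.id`; the hostnames, the group name, and the `failover_config` helper are hypothetical, and no real client library is invoked.

```python
def failover_config(config: dict[str, str], shadow_bootstrap: str) -> dict[str, str]:
    """Return a copy of a client config repointed at the shadow cluster.

    Everything except the bootstrap endpoints is left untouched. In particular
    the consumer group id stays the same, so the group's committed offsets are
    reused as-is instead of being translated or reset.
    """
    updated = dict(config)
    updated["bootstrap.servers"] = shadow_bootstrap
    return updated


# Hypothetical endpoints and group name, for illustration only.
consumer_config = {
    "bootstrap.servers": "broker-1.region-a.example.com:9092",
    "group.id": "orders-processor",
}
shadow_bootstrap = "broker-1.region-b.example.com:9092"

after_failover = failover_config(consumer_config, shadow_bootstrap)
print(after_failover["bootstrap.servers"])  # broker-1.region-b.example.com:9092
print(after_failover["group.id"])           # orders-processor
```

Producers follow the same rewrite. How quickly clients notice the outage and reconnect depends on their timeout settings, which is where the launch post places the RTO bound.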
Cost
Two clusters' worth of compute and storage, plus cross-region replication bandwidth: the same order of magnitude as MM2. The efficiency win is in operator time and complexity, not compute.
Trade-offs vs stretch cluster
Pick this pattern when:
- A cross-region round trip on every write (30-150 ms) would violate the latency SLA.
- Per-region write availability matters (each cluster stays writable even under partition).
- The operational model already supports two regional clusters.
Pick stretch cluster when:
- RPO = 0 is a hard requirement.
- Single-control-plane operations outweigh per-write latency cost.
- Cross-region RTT fits within the write latency budget.
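The two checklists can be condensed into a small decision helper. This is a deliberately simplified sketch: it considers only the RPO requirement and the latency budget, ignores the operational-model criteria, and uses hypothetical names (`DrRequirements`, `choose_dr_shape`) and return labels.

```python
from dataclasses import dataclass


@dataclass
class DrRequirements:
    rpo_zero_required: bool         # is RPO = 0 a hard requirement?
    write_latency_budget_ms: float  # maximum acceptable per-write latency
    cross_region_rtt_ms: float      # measured round trip between the two regions


def choose_dr_shape(req: DrRequirements) -> str:
    """Map the latency and RPO criteria from the checklists onto a DR shape."""
    rtt_fits = req.cross_region_rtt_ms <= req.write_latency_budget_ms
    if req.rpo_zero_required:
        # Only synchronous replication gives RPO = 0, and it pays the RTT on every write.
        return "stretch cluster" if rtt_fits else "conflict"
    # RPO of seconds is acceptable: async works, and sync is an option only if the RTT fits.
    return "either" if rtt_fits else "offset-preserving async"


print(choose_dr_shape(DrRequirements(False, 20, 80)))  # offset-preserving async
print(choose_dr_shape(DrRequirements(True, 200, 80)))  # stretch cluster
print(choose_dr_shape(DrRequirements(True, 20, 80)))   # conflict
```

The "conflict" result marks requirements that neither shape satisfies as stated: a zero RPO needs synchronous replication, which the latency budget rules out. The "either" result is where the operational criteria, such as a single control plane versus two regional clusters, decide the choice.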
Seen in
- sources/2025-11-06-redpanda-253-delivers-near-instant-disaster-recovery-and-more — canonical wiki source. Redpanda Shadowing is introduced as the first broker-native instantiation of this three-property composition on the streaming-broker substrate, positioned explicitly against MirrorMaker2 (offset-translated) and the stretch cluster (sync).
Related
- systems/redpanda-shadowing — the canonical instance.
- systems/redpanda — the broker.
- systems/kafka — the wire protocol.
- concepts/offset-preserving-replication — property 2.
- concepts/broker-internal-cross-cluster-replication — property 3.
- concepts/mirrormaker2-async-replication — the translated-offsets alternative.
- concepts/asynchronous-replication — property 1.
- concepts/rpo-rto — the DR budget dimension this pattern shrinks on both axes.
- patterns/hot-standby-cluster-for-dr — the general pattern family.
- patterns/async-replication-for-cross-region — the broader pattern class.
- companies/redpanda — the company shipping the instance.