CONCEPT Cited by 1 source
Offset-preserving replication¶
Definition¶
Offset-preserving replication is cross-cluster replication where
the destination cluster holds the same per-partition offsets as
the source cluster. A record written at offset N on the source
is reachable at offset N on the destination. Consumers that fail
over from source to destination resume at the same offsets they
held on the source, without an offset-translation map or
re-snapshot.
Offset preservation is a structural property that decides how expensive consumer failover is during disaster recovery. Without it, each consumer group needs an external map that records "offset X on source corresponds to offset Y on destination" — the map has to be kept in sync with the replication stream and consulted at failover time, adding a subsystem to the DR critical path.
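The definition can be read as an invariant that is easy to state in code. The following is a minimal sketch over a hypothetical in-memory model (the `Log` type and `is_offset_preserving` are illustrative names, not part of any Redpanda or Kafka API): the destination may lag the source, but whatever it holds must sit at the source's offset.

```python
from typing import Dict, Tuple

# Hypothetical model: (topic, partition) -> {offset: record bytes}
Log = Dict[Tuple[str, int], Dict[int, bytes]]


def is_offset_preserving(source: Log, destination: Log) -> bool:
    """True if every record the destination holds sits at its source offset.

    Replication is asynchronous, so the destination may hold fewer records
    than the source, but never a different record at the same offset and
    never an offset the source does not have.
    """
    for tp, dest_records in destination.items():
        src_records = source.get(tp, {})
        for offset, value in dest_records.items():
            if src_records.get(offset) != value:
                return False
    return True


source = {("orders", 0): {0: b"a", 1: b"b", 2: b"c"}}
shadow = {("orders", 0): {0: b"a", 1: b"b"}}    # lagging, but aligned
mirrored = {("orders", 0): {0: b"b", 1: b"c"}}  # same data, renumbered

print(is_offset_preserving(source, shadow))    # True
print(is_offset_preserving(source, mirrored))  # False
```

The `mirrored` case is the one that forces an offset-translation map: the records are all present, but a consumer's committed source offset no longer points at the same record.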
Canonical wiki source¶
Introduced in the Redpanda 25.3 launch post as the load-bearing property of Redpanda's new Shadowing feature:
"Shadowing combines an asynchronous replication mechanism with offset preservation, allowing for multi-region disaster recovery with simpler client failover procedures."
"Shadowing creates a fully functional, hot-standby clone of your entire Redpanda cluster — topics, configs, consumer group offsets, ACLs, schemas — the works!"
The shadow cluster is "byte-for-byte, offset-preserving"; the full disclosure is in the 25.3 post (sources/2025-11-06-redpanda-253-delivers-near-instant-disaster-recovery-and-more).
Contrast with MirrorMaker2¶
MirrorMaker2 (MM2) does not preserve offsets. MM2 runs as a set of Kafka Connect connectors that consume the source cluster's topics and produce the records to the destination cluster. The destination cluster assigns its own offsets on ingest, and those offsets are independent of the source's.
To make MM2-replicated data usable for failover, MM2 maintains a per-consumer-group offset translation map in a separate Kafka topic:
- The source topic's consumer-group commits are mirrored to the destination.
- MM2 writes (source-offset, destination-offset) translations to a __consumer_offsets-equivalent on the destination.
- At failover, the consumer reads its last-committed source offset, looks up the translated destination offset, and resumes there.
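The failover lookup described in the list above can be sketched as follows. This is an illustration only, assuming the translations for one partition are available as a sorted list of (source-offset, destination-offset) pairs; `translate_offset` and `sync_points` are hypothetical names, and the sketch does not reproduce MM2's actual topic layout or algorithm.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def translate_offset(
    sync_points: List[Tuple[int, int]], source_offset: int
) -> Optional[int]:
    """Map a committed source offset to a destination offset.

    sync_points holds (source_offset, destination_offset) pairs sorted by
    source offset. The lookup uses the latest pair at or before the committed
    offset, so the result can trail the consumer's true position.
    """
    idx = bisect_right(sync_points, (source_offset, float("inf"))) - 1
    if idx < 0:
        return None  # no translation recorded yet for this position
    return sync_points[idx][1]


# Translations recorded every 100 source records (hypothetical data).
sync_points = [(100, 40), (200, 140), (300, 240)]

print(translate_offset(sync_points, 250))  # 140
print(translate_offset(sync_points, 50))   # None
```

In this example the consumer had committed source offset 250, but the nearest recorded translation is for 200, so it resumes at destination offset 140 and re-reads the records in between. With no recorded translation at all, the consumer falls back to whatever offset-reset policy it is configured with.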
This works but adds three kinds of operational cost:
- Translation map lag: the map may be stale at the exact moment of failover, forcing the consumer to replay or skip records.
- Client-side translation awareness: consumers need MM2-compatible offset-reset logic; stock Kafka consumers don't know about the map.
- Offset-numbering divergence: after a failover the destination becomes the source for the return leg, and by then the offset numbers on the two clusters have drifted apart.
Offset preservation removes all three costs. A consumer that knows its last source offset resumes at the same offset on the destination, full stop. This is Shadowing's canonical client-side simplification over MM2.
When it's feasible¶
Offset-preserving replication requires the destination cluster to accept records with externally determined offsets rather than assign its own. This is a broker-internal capability: on the standard Kafka produce path the broker assigns offsets as it appends, so a producing client cannot choose them. A broker that imports records into the same offset slots the source used has to do so at a layer below the producer API.
This is why Shadowing is a broker-internal mechanism, not a Kafka Connect connector: a connector that produces records via the public API cannot preserve source offsets, however sophisticated its bookkeeping. The feature has to live inside the broker's log layer.
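The distinction between the two write paths can be illustrated with a toy single-partition log. This is a sketch under stated assumptions, not Redpanda's or Kafka's implementation: `PartitionLog`, `append`, and `import_at` are hypothetical names chosen for the example.

```python
from typing import Dict


class PartitionLog:
    """Toy single-partition log; not a real broker API."""

    def __init__(self) -> None:
        self.records: Dict[int, bytes] = {}
        self.next_offset = 0

    def append(self, value: bytes) -> int:
        """Producer-style write: the log picks the offset."""
        offset = self.next_offset
        self.records[offset] = value
        self.next_offset = offset + 1
        return offset

    def import_at(self, offset: int, value: bytes) -> None:
        """Replication-style write: the caller supplies the source offset."""
        if offset < self.next_offset:
            raise ValueError("offset already written or behind log end")
        self.records[offset] = value
        self.next_offset = offset + 1


# Source partition whose earliest records were removed by retention,
# so its first readable offset is 500 rather than 0.
source_records = {500: b"a", 501: b"b", 502: b"c"}

via_producer = PartitionLog()
for value in source_records.values():
    via_producer.append(value)           # lands at 0, 1, 2

via_import = PartitionLog()
for offset, value in source_records.items():
    via_import.import_at(offset, value)  # lands at 500, 501, 502

print(sorted(via_producer.records))  # [0, 1, 2]
print(sorted(via_import.records))    # [500, 501, 502]
```

A client that can only call `append` has no way to reach offset 500 on an empty destination, which is the situation a connector built on the public producer API is in.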
Why offset preservation matters for streaming DR¶
Seconds-scale recovery times in DR hinge on how fast consumers can resume:
- With translation: restart consumers → block on map lookup → resolve potentially-stale translation → resume at approximate offset → potentially re-process or skip a window of records.
- With offset preservation: restart consumers → point them at the shadow cluster → resume at the exact committed offset.
The difference is a resume that takes seconds with no data-processing ambiguity versus a longer resume with a non-zero window of replayed or skipped records. For latency-sensitive consumers (real-time pipelines, reactive agents, dashboards) this is the load-bearing DR-readiness property.
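As a rough worked estimate of the replay window that a stale translation map adds, assume a partition ingests records at a steady rate and the map trails the replication stream by a fixed interval. Both numbers below are hypothetical and `worst_case_replay` is an illustrative helper, not a measured result.

```python
def worst_case_replay(records_per_second: float, map_staleness_seconds: float) -> int:
    """Upper bound on records re-read after failover through a stale map."""
    return int(records_per_second * map_staleness_seconds)


# Hypothetical workload: 20,000 records/s on one partition, map 5 s behind.
with_translation = worst_case_replay(20_000, 5)

# With offset preservation there is no map, so this term is zero.
with_preservation = worst_case_replay(20_000, 0)

print(with_translation)   # 100000
print(with_preservation)  # 0
```

The estimate covers only the translation-induced term; any lag in replicating the data or the committed offsets themselves is a separate contribution that applies to both approaches.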
Seen in¶
- sources/2025-11-06-redpanda-253-delivers-near-instant-disaster-recovery-and-more: canonical wiki source introducing offset-preserving replication as the structural property of Redpanda Shadowing that separates it from MirrorMaker2 and the prior Redpanda Migrator connector. Paired with broker-internal replication and async replication as the three defining properties of Shadowing.
Related¶
- systems/redpanda-shadowing: the canonical instance.
- systems/redpanda: the broker implementing the property.
- systems/kafka: the wire protocol whose offsets are being preserved.
- concepts/mirrormaker2-async-replication: the non-offset-preserving alternative.
- concepts/asynchronous-replication: the replication mode offset preservation composes with.
- concepts/rpo-rto: the DR budget whose consumer-side contribution offset preservation shrinks.
- concepts/broker-internal-cross-cluster-replication: the structural property that makes offset preservation feasible.
- patterns/offset-preserving-async-cross-region-replication: the composed pattern.
- patterns/hot-standby-cluster-for-dr: the DR shape offset preservation slots into.