PATTERN Cited by 1 source

Reciprocal active-passive via parallel shadow links

Pattern

Deploy two streaming clusters in two regions, each configured with a single unidirectional shadow link reading from the other, such that:

  • Each cluster owns a disjoint subset of topics (named by a prefix convention) and is the sole writer to them.
  • Each cluster shadows the other cluster's topics as read-only replicas.
  • Failover in either direction (A → B or B → A) promotes the failed cluster's shadowed topics to writable on the surviving cluster.

The result is a fault-tolerant pair where each cluster does real work locally (per-region producer/consumer traffic for its own topics) and stands ready to take over the other's workload on failover.

Canonicalised by the 2026-04-21 Redpanda Shadow Linking deep-dive:

"Running a reciprocal active-passive cluster pair is as simple as configuring two shadow links — one on each cluster."

"This kind of reciprocal active-passive architecture, in which both clusters are active and usable, can still be achieved with parallel shadow links."

Why this pattern instead of uni-directional hot-standby

The canonical single-direction hot-standby shape has one primary cluster doing all the work and one secondary cluster sitting idle as a shadow. This wastes the secondary's compute.

Reciprocal active-passive gets both clusters working without crossing into the write-conflict territory of true active-active (multi-writer-same-topic):

  • Hardware utilisation: both clusters carry real workloads; neither sits idle waiting for a failover.
  • Per-region write latency: producers in region A write to cluster A (local) rather than a distant primary in region B.
  • Bidirectional failover: either cluster can save the other; the DR shape doesn't have a "primary" and "backup" asymmetry.
  • No write conflicts: each topic still has a single writer by construction (the owning cluster); the reciprocal shape never tries to merge conflicting writes.

Composition

Three subsystems compose to produce the pattern:

1. Parallel shadow links

  • Cluster B has a shadow link reading from cluster A → all a_* topics flow A → B.
  • Cluster A has a shadow link reading from cluster B → all b_* topics flow B → A.

Each link is independent. Each uses offset-preserving, broker-internal, async replication per the Shadow Linking mechanism.
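
The two-link arrangement can be sketched as plain data. This is a minimal model, not Redpanda's configuration format: the `ShadowLink` class, its field names, and the `shadowed_on` helper are hypothetical and exist only to show that each link is a one-way, prefix-selected read from the peer cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShadowLink:
    """Hypothetical description of one unidirectional shadow link."""
    runs_on: str       # cluster that hosts the link and holds the read-only copies
    reads_from: str    # cluster that owns (writes) the source topics
    topic_prefix: str  # prefix rule selecting which source topics are shadowed

    def selects(self, topic: str) -> bool:
        return topic.startswith(self.topic_prefix)


# One link per cluster, each reading from the other.
LINKS = [
    ShadowLink(runs_on="B", reads_from="A", topic_prefix="a_"),
    ShadowLink(runs_on="A", reads_from="B", topic_prefix="b_"),
]


def shadowed_on(cluster: str, owned: dict[str, list[str]]) -> list[str]:
    """Topics that `cluster` holds as read-only shadows, given each cluster's owned topics."""
    result = []
    for link in LINKS:
        if link.runs_on == cluster:
            result.extend(t for t in owned[link.reads_from] if link.selects(t))
    return sorted(result)
```

With `owned = {"A": ["a_orders", "a_shipments"], "B": ["b_inventory", "b_payments"]}`, calling `shadowed_on("B", owned)` returns the two `a_*` topics and `shadowed_on("A", owned)` returns the two `b_*` topics.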

2. [Topic-prefix namespacing](<../concepts/topic-prefix-namespacing-convention.md>)

Topics and consumer groups are named with a prefix encoding their origin cluster (a_*, b_*, or region codes, DC codes, etc.). This:

  • Prevents name collisions between the two clusters.
  • Allows the shadow link configuration to be a prefix rule rather than an explicit topic list, so new topics are automatically included.
  • Makes origin-cluster visible in every topic name at a glance.
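
A naming convention like this is easiest to keep honest when it is enforced in code. The sketch below assumes the `a_` / `b_` prefixes from this page; the `PREFIX_OWNERS` table and both helper names are hypothetical, and a real deployment might use region or DC codes instead.

```python
# Hypothetical mapping from topic prefix to the cluster that owns (writes) it.
PREFIX_OWNERS = {"a_": "A", "b_": "B"}


def owner_of(topic: str) -> str:
    """Return the owning cluster encoded in a topic name, or fail loudly."""
    for prefix, cluster in PREFIX_OWNERS.items():
        if topic.startswith(prefix):
            return cluster
    raise ValueError(f"topic {topic!r} has no recognised origin prefix")


def qualified_name(cluster: str, base_name: str) -> str:
    """Build the name a cluster should use when creating a topic it owns."""
    for prefix, owner in PREFIX_OWNERS.items():
        if owner == cluster:
            return prefix + base_name
    raise ValueError(f"cluster {cluster!r} has no registered prefix")
```

Because both clusters derive names through the same table, `qualified_name("A", "orders")` and `qualified_name("B", "orders")` can never collide, and `owner_of` rejects any topic created outside the convention.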

3. Consumer-side dual-subscription

Consumers that need to see the full logical dataset subscribe to both their cluster's locally-owned topics and the shadow copies of the other cluster's topics. The 2026-04-21 post:

"Consuming messages is conceptually a little more complex, in that there are now two topics that need to be read by the same consumer group (local and shadow). In practice, this just means a little more configuration of the consuming client."

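The extra consumer configuration mostly amounts to subscribing to every prefixed variant of a logical topic. A minimal sketch, assuming the `a_` / `b_` prefixes used on this page: it builds a regular expression that many Kafka-compatible clients could take as a pattern subscription (the exact call is client-specific and not shown here). Both function names are hypothetical.

```python
import re


def dual_subscription_pattern(logical_name: str, prefixes=("a_", "b_")) -> str:
    """Regex matching the local topic and its shadow counterpart for one logical dataset."""
    alternatives = "|".join(re.escape(p) for p in prefixes)
    return f"^(?:{alternatives}){re.escape(logical_name)}$"


def matching_topics(pattern: str, visible_topics: list[str]) -> list[str]:
    """Topics on the local cluster (owned or shadowed) that the pattern would select."""
    rx = re.compile(pattern)
    return sorted(t for t in visible_topics if rx.match(t))
```

For a cluster that exposes `a_orders`, `b_orders`, `a_shipments`, and `_schemas`, the pattern for `"orders"` selects only `a_orders` and `b_orders`, so one consumer group reads both the local and the shadow topic.
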
Worked example

Region A (cluster A)                 Region B (cluster B)
──────────────────                   ──────────────────
a_orders       (writable)    ═══▶    a_orders       (shadow, read-only)
a_shipments    (writable)    ═══▶    a_shipments    (shadow, read-only)
b_inventory    (shadow, RO)  ◀═══    b_inventory    (writable)
b_payments     (shadow, RO)  ◀═══    b_payments     (writable)

Producers in region A write to a_*
Producers in region B write to b_*
Consumers in region A subscribe to a_* + b_* (via shadows)
Consumers in region B subscribe to b_* + a_* (via shadows)

Failover A → B:

  1. a_orders and a_shipments shadows on B become writable.
  2. Producers for a_* reconfigure to point at B.
  3. Consumers for a_* either already read from B (via the shadow) or reconfigure to B (if they were reading from A directly).
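  4. Cluster B's own b_* topics are unaffected and stay writable on B. B → A replication is pulled by the shadow link on cluster A, so it stalls while A is down and can resume once region A recovers; until then, A's consumers of b_* topics are also down.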
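
The promotion and producer re-routing steps above can be sketched as a small state change. This is an in-memory model under stated assumptions, not a Redpanda API: the `Cluster` class and `fail_over` function are hypothetical, and the model assumes every shadow on the survivor originates from the failed peer, which holds for a two-cluster pair.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    writable: set[str] = field(default_factory=set)  # topics this cluster owns
    shadows: set[str] = field(default_factory=set)   # read-only copies of the peer's topics
    up: bool = True


def fail_over(failed: Cluster, survivor: Cluster) -> dict[str, str]:
    """Promote the failed cluster's shadowed topics on the survivor.

    Returns the new producer routing (topic -> cluster name) for the promoted topics.
    """
    failed.up = False
    promoted = set(survivor.shadows)  # in a pair, all shadows come from the failed peer
    survivor.shadows -= promoted
    survivor.writable |= promoted
    return {topic: survivor.name for topic in sorted(promoted)}
```

Running `fail_over(a, b)` on the worked example returns `{"a_orders": "B", "a_shipments": "B"}`, leaves B writable for all four topics, and leaves B with no remaining shadows. Consumer reconfiguration and failback are outside this sketch.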

Schema-registry primary-site asymmetry

One constraint named by the 2026-04-21 post:

"A primary site for schema registry would need to be chosen (since both sites will use _schemas)."

Because Redpanda's schema registry is stored in a topic named _schemas, both sites can't own a locally-writable _schemas simultaneously. One site is the schema-registry primary; the other replicates _schemas as a shadow. This is the one asymmetry in an otherwise symmetric reciprocal topology — schema-registry failover is a distinct step from topic failover.
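
The asymmetry shows up as a single special case in the ownership rule. A minimal sketch, assuming the `a_` / `b_` prefixes from this page: `PREFIX_OWNERS`, `topic_role`, and the `schema_primary` parameter are hypothetical names used only for illustration.

```python
SCHEMAS_TOPIC = "_schemas"
PREFIX_OWNERS = {"a_": "A", "b_": "B"}  # hypothetical prefix -> owning cluster


def topic_role(cluster: str, topic: str, schema_primary: str) -> str:
    """Return "writable" or "shadow" for `topic` as seen from `cluster`."""
    if topic == SCHEMAS_TOPIC:
        # _schemas carries no origin prefix, so ownership is decided by choice of primary.
        owner = schema_primary
    else:
        owner = next(
            (c for prefix, c in PREFIX_OWNERS.items() if topic.startswith(prefix)),
            None,
        )
        if owner is None:
            raise ValueError(f"topic {topic!r} has no recognised origin prefix")
    return "writable" if cluster == owner else "shadow"
```

With site A chosen as the schema-registry primary, `topic_role("B", "_schemas", "A")` is `"shadow"` even though `topic_role("B", "b_payments", "A")` is `"writable"`.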

Trade-offs

| Dimension | Reciprocal active-passive | Uni-directional hot-standby |
|---|---|---|
| Hardware utilisation | Full (both clusters work) | Half (secondary idle) |
| Per-region write latency | Local | Local for primary region, remote for secondary |
| Operational complexity | Higher (two shadow links + prefix discipline + consumer dual-subscription) | Lower (one link) |
| Schema registry | One site must be primary | No conflict (primary is the only writer) |
| Failover direction | Either direction | One direction only |
| Write-conflict risk | None (single writer per topic) | None |
| Blast radius of a failover | Per-cluster's-topic-family | Whole workload |

Pick reciprocal active-passive when:

  • Hardware cost of an idle secondary cluster is unacceptable.
  • Both regions have meaningful local workloads worth pinning to their own cluster.
  • The operational team can handle the additional complexity of two shadow links + prefix discipline + schema-registry primary-site selection.

Stick with uni-directional hot-standby when:

  • Operational simplicity outweighs hardware utilisation.
  • One region is clearly the primary and the other is pure DR.
  • Schema registry primary-site selection is a meaningful complication the team wants to avoid.

Not to be confused with active-active

Reciprocal active-passive is not active-active multi-writer. The load-bearing distinction: every topic still has exactly one writer at any time (the cluster that owns its prefix). Two clusters never accept writes to the same logical topic, so there is no write-conflict resolution problem. The "active-active-ish" appearance comes from the aggregate workload being bidirectional, not from any individual topic being written on both sides.

This makes the pattern safe on single-writer streaming substrates (Kafka, Redpanda) that lack conflict-resolution primitives. True active-active on these substrates requires application-level partitioning that produces the same single-writer invariant, or a different storage substrate entirely (Spanner, CockroachDB, CRDT-based stores).
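
The single-writer invariant is simple to check mechanically. The sketch below assumes an operator can list which topics each cluster currently accepts writes for; the `writer_conflicts` function and its input shape are hypothetical.

```python
from collections import Counter


def writer_conflicts(writable_by_cluster: dict[str, set[str]]) -> list[str]:
    """Return topics that more than one cluster currently accepts writes for."""
    counts = Counter(
        topic for topics in writable_by_cluster.values() for topic in topics
    )
    return sorted(topic for topic, n in counts.items() if n > 1)
```

A healthy reciprocal pair yields an empty list. A non-empty result means some topic has crossed into multi-writer territory, for example after a failover where the recovered cluster came back still writable for topics that had already been promoted on its peer.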

Seen in
