CONCEPT Cited by 1 source

Topic-prefix namespacing convention

Definition

Topic-prefix namespacing convention is the operational discipline of encoding the origin cluster into topic and consumer-group names via a short prefix (e.g. a_, b_, us_, eu_) when multiple clusters interoperate via cross-cluster replication. The prefix serves three purposes at once:

  1. Collision avoidance — the same logical topic on both clusters gets distinct names so they don't clash when replicated into a shared namespace.
  2. Prefix-based replication configuration — the shadow-link config can select topics and groups by prefix (e.g. "replicate everything starting with a_") instead of enumerating each one explicitly. New topics created with the convention are automatically included.
  3. Operator observability — a topic name alone tells you which cluster originates it; no lookup table required.

Canonical wiki source

Introduced by the 2026-04-21 Redpanda Shadow Linking deep-dive:

"This design benefits from using a consistent prefix to name topics and consumer groups, identifying their source site. In the example above, the prefixes of a_ and b_ in the topic names indicate which cluster they originate in."

"While not strictly necessary, the name prefixing is helpful for multiple reasons:

- Reduces the likelihood of topic/group naming clashes between sites
- Simplifies shadow link configuration (topics and groups can be selected for replication on the basis of the prefix rather than needing a static list of topics and groups)
- Helps operators know at a glance which site a topic originates from"

The three load-bearing benefits

1. Collision avoidance

In a reciprocal active-passive topology, each cluster writes to its own topics and shadow-replicates the other's. Without prefixes, a logical topic called orders on cluster A and orders on cluster B would collide when each cluster tries to hold both: which is the local one and which is the shadow?

Prefixes disambiguate: a_orders is written on A and shadowed to B; b_orders is written on B and shadowed to A. Both names exist on both clusters, each with a clear owner and a clear replication direction.
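The clash can be illustrated with a minimal sketch that models each cluster's namespace as a plain set of names. This is only an illustration: merged_namespace is a hypothetical helper, not part of any Redpanda API.

```python
def merged_namespace(local_topics, shadowed_topics):
    """Return (all_names, clashes) for a cluster that holds its own topics
    plus the shadows it receives from the other site."""
    local = set(local_topics)
    shadowed = set(shadowed_topics)
    return local | shadowed, local & shadowed


# Unprefixed: both sites own a logical topic called "orders".
_, clashes = merged_namespace(["orders", "inventory"], ["orders", "shipments"])
print(sorted(clashes))  # ['orders']

# Prefixed: each name carries its owner, so nothing overlaps.
_, clashes = merged_namespace(["a_orders", "a_inventory"], ["b_orders", "b_shipments"])
print(sorted(clashes))  # []
```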

2. Prefix-based replication configuration

Without a convention, configuring which topics to replicate means enumerating each topic in the shadow link's config. Every new topic needs a config change; every removed topic needs a config change; the config drifts out of sync with the actual topic set on the cluster.

With a prefix convention, the shadow link config is a single rule: "replicate all topics matching a_*". New topics matching the prefix are automatically included as they are created. The config becomes declarative and stable rather than enumerative and drift-prone.

This generalises the pattern from "per-topic mirroring" to "topic-family mirroring" — which is the operationally sustainable shape when new topics appear regularly.
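The difference between an enumerated list and a prefix rule can be sketched as follows. The selection logic here is a stand-in written for illustration; it does not reproduce the actual shadow link configuration syntax, and select_by_prefix is a hypothetical name.

```python
def select_by_prefix(topics, prefix):
    """Return the topics a prefix rule would pick up, in sorted order."""
    return sorted(t for t in topics if t.startswith(prefix))


# A static list written when only two a_ topics existed.
static_list = ["a_orders", "a_shipments"]

cluster_topics = ["a_orders", "a_shipments", "b_inventory"]
cluster_topics.append("a_returns")  # a new topic created later

selected = select_by_prefix(cluster_topics, "a_")
print(selected)  # ['a_orders', 'a_returns', 'a_shipments']

# The static list has drifted: it misses the new topic.
print(sorted(set(selected) - set(static_list)))  # ['a_returns']
```

The prefix rule is evaluated against whatever topics exist at the time, so it needs no edit when a_returns appears, whereas the static list does.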

3. Operator observability

When an operator runs rpk topic list (or equivalent) and sees a_orders, b_inventory, a_shipments, the origin cluster is immediately obvious from the name. No lookup table, no wiki page, no "which DC does this topic live in really?" slack thread. The data-about-the-data is in the data's identifier.

This matters at incident time — an operator trying to diagnose a replication lag spike needs to know which cluster's producers are responsible for which topics. Prefix convention makes this a glance, not an investigation.
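As a rough sketch of how the name alone answers the "which cluster?" question, the snippet below groups a topic listing by origin. The SITE_PREFIXES mapping and both function names are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical mapping from prefix to originating site.
SITE_PREFIXES = {"a_": "cluster A", "b_": "cluster B"}


def origin_of(topic):
    """Return the originating site for a topic, or None if it is unprefixed."""
    for prefix, site in SITE_PREFIXES.items():
        if topic.startswith(prefix):
            return site
    return None


def group_by_origin(topics):
    """Group topic names by the site their prefix points at."""
    grouped = defaultdict(list)
    for topic in topics:
        grouped[origin_of(topic)].append(topic)
    return dict(grouped)


print(group_by_origin(["a_orders", "b_inventory", "a_shipments"]))
# {'cluster A': ['a_orders', 'a_shipments'], 'cluster B': ['b_inventory']}
```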

Variations in naming scheme

The 2026-04-21 post uses a_ / b_ as the simplest possible example. Production deployments might use:

  • Region codes: us_, eu_, ap_.
  • Datacenter codes: iad_, lhr_, nrt_.
  • Logical cluster names: prod_, staging_, dr_.
  • Organization-specific tenancy: team-a_, team-b_.

The load-bearing property is that the prefix encodes ownership and stays stable across the topic's lifetime. Any convention satisfying this works. Mixing conventions within a single cluster pair (some topics prefixed, others not) defeats the prefix-based replication configuration benefit, because unprefixed topics fall outside every prefix rule.
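One way to catch mixed conventions early is a small audit that lists names carrying none of the agreed prefixes. This is a sketch under the assumption that the agreed prefixes are known up front; unprefixed is a hypothetical helper name.

```python
def unprefixed(names, prefixes):
    """Return the names that start with none of the agreed prefixes."""
    allowed = tuple(prefixes)  # str.startswith accepts a tuple of options
    return sorted(n for n in names if not n.startswith(allowed))


topics = ["us_orders", "eu_orders", "payments"]
print(unprefixed(topics, ["us_", "eu_"]))  # ['payments']
```

Any name the audit reports would be skipped by a prefix-based replication rule, so it is worth renaming or handling explicitly.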

Consumer groups too, not just topics

The 2026-04-21 post explicitly says "consistent prefix to name topics and consumer groups". The same prefix applies to consumer-group names because consumer-group commits are replicated alongside topic data (the five-axis replication enumerated in the post: data + configs + consumer-groups + ACLs + schemas). A consumer group a_analytics commits offsets that need to be replicated from A to B so a failover can resume the consumer on B; naming it with the same prefix makes the replication config symmetric on the consumer-group axis.
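The symmetry can be sketched by applying one prefix to both name lists. The dictionary shape below is purely illustrative and is not the shadow link's real configuration format; replication_set is a hypothetical name.

```python
def replication_set(prefix, topics, groups):
    """Apply one prefix to both topics and consumer groups."""
    return {
        "topics": sorted(t for t in topics if t.startswith(prefix)),
        "groups": sorted(g for g in groups if g.startswith(prefix)),
    }


print(replication_set(
    "a_",
    topics=["a_orders", "a_shipments", "b_inventory"],
    groups=["a_analytics", "b_billing"],
))
# {'topics': ['a_orders', 'a_shipments'], 'groups': ['a_analytics']}
```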

When the convention is not needed

A pure one-direction DR setup — cluster A is the only writer, cluster B only ever receives shadows — doesn't need the prefix. Every topic on A is unambiguously A's; every shadow on B is unambiguously from A. The convention starts to pay off once both clusters write (reciprocal active-passive) or once a cluster shadows from multiple sources.
