CONCEPT Cited by 1 source

Cross Cluster Replication (CCR)¶

Cross Cluster Replication (CCR) is Elasticsearch's primitive for replicating index data between otherwise-independent Elasticsearch clusters. One cluster is the leader (read/write); one or more clusters are followers (read-only, continuously pulling from the leader). Replication is one-way, from leader to follower, and operates at Lucene-segment granularity: the follower receives only data that has already been durably persisted to disk on the leader.

The shape is distinct from ES's intra-cluster replication (where primary and replica shards of the same index live in one cluster): CCR is inter-cluster replication, a leader/follower pattern at the cluster level.

Wire model: leader / follower / auto-follow¶

Leader index — the source of writes; a normal index on the leader cluster.
Follower index — an index on a follower cluster configured to follow the leader. Reads allowed; writes go to the leader.
Auto-follow policy — pattern-matching rule ("follow all indexes whose names match app-* on cluster primary") that installs follower indexes automatically as new leader indexes are created.
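
The three roles above map onto two CCR REST calls: an explicit follow request per follower index, and an auto-follow pattern. The sketch below only builds those requests as `(method, path, body)` tuples and sends nothing. The helper names, the policy name `app-policy`, and the index name `app-logs` are hypothetical; the remote-cluster alias `primary` and the pattern `app-*` are taken from the example in the list above.

```python
import json


def follow_request(follower_index, remote_cluster, leader_index):
    """Build the request that creates one follower index for one leader index."""
    path = f"/{follower_index}/_ccr/follow"
    body = {"remote_cluster": remote_cluster, "leader_index": leader_index}
    return ("PUT", path, body)


def auto_follow_request(policy_name, remote_cluster, patterns):
    """Build the request that installs an auto-follow pattern."""
    path = f"/_ccr/auto_follow/{policy_name}"
    body = {
        "remote_cluster": remote_cluster,
        "leader_index_patterns": list(patterns),
        # Assumption: followers keep the leader index's name.
        "follow_index_pattern": "{{leader_index}}",
    }
    return ("PUT", path, body)


# Hypothetical names, for illustration only.
demo_requests = [
    follow_request("app-logs", "primary", "app-logs"),
    auto_follow_request("app-policy", "primary", ["app-*"]),
]
for method, path, body in demo_requests:
    print(method, path, json.dumps(body))
```

Both requests are issued against the follower cluster, which must already have the leader registered as a remote cluster under the alias used in `remote_cluster`.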

Auto-follow's new-only gap¶

The auto-follow policy matches only indexes created after the policy is installed; pre-existing leader indexes are not retroactively followed. Any system applying CCR to a long-lived deployment therefore needs a bootstrap step that enumerates the pre-existing indexes and explicitly attaches a follower to each before relying on the auto-follow policy (see patterns/bootstrap-then-auto-follow). The same gap is a common failure mode for any policy-based replication or attachment primitive that acts only on newly created objects.
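
A minimal sketch of the bootstrap step's selection logic, assuming follower indexes reuse the leader's index names and that `fnmatch`-style wildcards are a close enough stand-in for auto-follow patterns. The function name and index names are hypothetical. It computes which existing leader indexes still need an explicit follower; it does not talk to a cluster.

```python
from fnmatch import fnmatchcase


def indexes_needing_bootstrap(leader_indexes, follower_indexes, patterns):
    """Return leader indexes that match a pattern but have no follower yet.

    Assumption: a follower index has the same name as its leader index.
    """
    already_followed = set(follower_indexes)
    return sorted(
        name
        for name in leader_indexes
        if name not in already_followed
        and any(fnmatchcase(name, pattern) for pattern in patterns)
    )


# Hypothetical state: two matching leader indexes existed before the policy
# was installed, and only one of them has a follower.
pending = indexes_needing_bootstrap(
    leader_indexes=["app-users", "app-repos", "audit-1"],
    follower_indexes=["app-users"],
    patterns=["app-*"],
)
print(pending)
```

Each name returned would then get an explicit follow request; only after that pass can the auto-follow policy be relied on to cover everything.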

What CCR handles vs what it doesn't¶

CCR replicates documents. It does not handle:

Failover orchestration — promoting a follower to leader after leader loss, re-pointing clients, coordinating config changes.
Index deletion coordination — a leader-side delete must be echoed on the follower (or the follower will recreate the deleted index from the leader's history).
Upgrade ordering — ES-version compatibility constraints between leader and follower during rolling upgrades.
Multi-leader — CCR is one-way leader→follower; bidirectional replication requires per-index-role separation and external arbitration.

These lifecycle responsibilities fall to whoever owns the application layer. GitHub's GHES 3.19.1 rewrite is an explicit instance of this: "Elasticsearch only handles the document replication, and we're responsible for the rest of the index's lifecycle." (Source: sources/2026-03-03-github-how-we-rebuilt-the-search-architecture-for-high-availability)
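
To make the failover item concrete, the sketch below lists the index-level calls Elasticsearch documents for turning a follower index into a regular, writable index: pause following, close, unfollow, reopen. It is not GitHub's workflow, which the source does not detail, and it covers only one slice of failover; re-pointing clients and coordinating config changes remain outside it. The function name and index name are hypothetical.

```python
def promote_follower_requests(index):
    """Ordered (method, path) calls that convert a follower into a regular index.

    Order matters: the index must be paused and closed before it can be
    unfollowed, and reopened afterwards to accept writes.
    """
    return [
        ("POST", f"/{index}/_ccr/pause_follow"),  # stop pulling from the leader
        ("POST", f"/{index}/_close"),             # unfollow requires a closed index
        ("POST", f"/{index}/_ccr/unfollow"),      # drop the follower metadata
        ("POST", f"/{index}/_open"),              # reopen as a normal index
    ]


for method, path in promote_follower_requests("app-users"):
    print(method, path)
```

An orchestrator would run this sequence for every follower index on the replica being promoted, then handle the client-facing steps itself.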

Why the segment-level grain matters¶

Because CCR replicates at the Lucene-segment level (immutable, durable), the follower cluster is never ahead of the leader's durable state, and the replication stream is replay-safe. This is the same "replicate durable state, not in-flight operations" property that underpins streaming logical replication in Postgres, block-level asynchronous replication of EBS snapshots, and WAL-based physical replication in distributed SQL systems.

CCR is structurally an instance of change data capture (CDC), a stream of durable storage changes, at the Lucene-segment grain rather than the row grain.
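
The two properties claimed in this section (the follower is never ahead of the leader's durable state, and replaying the stream is safe) can be illustrated with a toy model. This is not Elasticsearch's implementation; the classes, the sequence-number scheme, and the checkpoint logic are all assumptions made for the illustration.

```python
class ToyLeader:
    def __init__(self):
        self.log = []      # every accepted write, in order
        self.durable = 0   # how many entries of the log are persisted

    def write(self, doc_id, value):
        self.log.append((doc_id, value))

    def persist(self):
        self.durable = len(self.log)

    def changes_since(self, checkpoint):
        """Expose only durable entries, numbered by their position in the log."""
        durable_slice = self.log[checkpoint:self.durable]
        return list(enumerate(durable_slice, start=checkpoint))


class ToyFollower:
    def __init__(self):
        self.docs = {}
        self.checkpoint = 0  # next sequence number this follower expects

    def apply(self, changes):
        for seq, (doc_id, value) in changes:
            if seq < self.checkpoint:
                continue  # already applied, so a replayed entry is skipped
            self.docs[doc_id] = value
            self.checkpoint = seq + 1


leader, follower = ToyLeader(), ToyFollower()
leader.write("a", 1)
leader.persist()
leader.write("b", 2)              # accepted but not yet durable

batch = leader.changes_since(follower.checkpoint)
follower.apply(batch)
follower.apply(batch)             # replaying the same batch changes nothing
print(follower.docs, follower.checkpoint, leader.durable)
```

The follower never sees the write to `b` until the leader persists it, and applying the same batch twice leaves the follower's state unchanged.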

Sibling replication primitives¶

CDC — row-level analog.
patterns/block-level-continuous-replication — analog at the block-storage layer (AWS Elastic DR, Arpio).
In-cluster primary-replica-shard replication — ES's other replication mechanism, intra-cluster, finer-grained, but doesn't give you cluster-level leader/follower semantics.

Canonical production shape (on the wiki)¶

The HA-search rewrite shipped in GHES 3.19.1 is the wiki's canonical CCR production instance: replace one multi-node ES cluster with N independent single-node clusters, each aligned to a node in the app-layer primary/replica topology, then join them with CCR. See patterns/single-node-cluster-per-app-replica.
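
As a rough sketch of what "join them with CCR" involves per replica, the function below plans two requests for each replica's single-node cluster: register the primary's cluster as a remote, then install an auto-follow pattern against it. This is an illustration under assumptions, not GHES's actual configuration; the host name, replica names, alias, and policy name are hypothetical.

```python
def plan_replication(primary_seed, replica_names, patterns, alias="primary"):
    """Map each replica to the requests its single-node cluster would receive."""
    remote_settings = {
        "persistent": {"cluster": {"remote": {alias: {"seeds": [primary_seed]}}}}
    }
    auto_follow = {
        "remote_cluster": alias,
        "leader_index_patterns": list(patterns),
        "follow_index_pattern": "{{leader_index}}",
    }
    return {
        replica: [
            ("PUT", "/_cluster/settings", remote_settings),
            ("PUT", f"/_ccr/auto_follow/{alias}-policy", auto_follow),
        ]
        for replica in replica_names
    }


# Hypothetical topology: one primary node, two replica nodes.
plan = plan_replication("primary.example.internal:9300", ["replica-1", "replica-2"], ["app-*"])
for replica, requests_for_replica in plan.items():
    for method, path, _body in requests_for_replica:
        print(replica, method, path)
```

Because auto-follow is new-only, a real rollout would also need the explicit bootstrap pass for indexes that already exist on the primary.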

Seen in¶

sources/2026-03-03-github-how-we-rebuilt-the-search-architecture-for-high-availability: GHES 3.19.1 uses CCR to replicate Lucene segments from the primary's single-node ES cluster to each replica's single-node ES cluster, replacing a failure-prone cross-node cluster. GitHub authored the lifecycle workflows (failover, index deletion, upgrades, bootstrap) around CCR's document-replication primitive.

systems/elasticsearch — the vendor primitive.
systems/lucene — segment-level replication boundary.
systems/github-enterprise-server — canonical consumer.
concepts/primary-replica-topology-alignment — the structural condition CCR allows you to express.
patterns/single-node-cluster-per-app-replica — the deployment pattern CCR unlocks.
patterns/bootstrap-then-auto-follow — the new-only-policy workaround pattern.
concepts/change-data-capture — the general class CCR belongs to.
patterns/block-level-continuous-replication — block-layer analog.