CONCEPT Cited by 1 source
Cross Cluster Replication (CCR)¶
Cross Cluster Replication (CCR) is Elasticsearch's primitive for replicating index data between otherwise-independent Elasticsearch clusters. One cluster is the leader (read/write); one or more clusters are followers (read-only, continuously pulling from the leader). Replication is one-way at the Lucene segment granularity — the follower receives data that's already been durably persisted to disk on the leader.
The shape is distinct from ES's intra-cluster replication (where primary and replica shards of the same index live in one cluster): CCR is inter-cluster replication, a leader/follower pattern at the cluster level.
Wire model: leader / follower / auto-follow¶
- Leader index — the source of writes; a normal index on the leader cluster.
- Follower index — an index on a follower cluster configured to follow the leader. It serves reads but rejects direct writes; clients must send writes to the leader.
- Auto-follow policy — pattern-matching rule ("follow all indexes whose names match `app-*` on cluster `primary`") that installs follower indexes automatically as new leader indexes are created.
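The roles above correspond to two CCR REST calls: one that creates a single follower index and one that installs an auto-follow policy. The following is a minimal sketch that only builds each request (method, path, JSON body) and does not send it. The paths and body fields follow Elasticsearch's documented follow and auto-follow APIs as far as we know; the function names, the policy name `app-indexes`, and the remote cluster alias `primary` are illustrative assumptions.

```python
import json


def follow_request(follower_index, remote_cluster, leader_index):
    """Build the request that creates one follower index (not sent here)."""
    return {
        "method": "PUT",
        "path": f"/{follower_index}/_ccr/follow",
        "body": {"remote_cluster": remote_cluster, "leader_index": leader_index},
    }


def auto_follow_request(policy_name, remote_cluster, patterns):
    """Build the request that installs an auto-follow policy (not sent here)."""
    return {
        "method": "PUT",
        "path": f"/_ccr/auto_follow/{policy_name}",
        "body": {
            "remote_cluster": remote_cluster,
            "leader_index_patterns": list(patterns),
            # Assumption: followers keep the same name as their leader index.
            "follow_index_pattern": "{{leader_index}}",
        },
    }


# Hypothetical policy: follow every app-* index on the cluster aliased "primary".
print(json.dumps(auto_follow_request("app-indexes", "primary", ["app-*"]), indent=2))
```

Keeping request construction separate from transport makes the payloads easy to inspect and test before they are pointed at a real cluster.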
Auto-follow's new-only gap¶
The auto-follow policy only matches indexes created after the policy is installed. Pre-existing leader indexes are not retroactively followed. Any system applying CCR to a long-lived deployment therefore needs a bootstrap step that enumerates pre-existing indexes and explicitly attaches followers before relying on the auto-follow policy — see patterns/bootstrap-then-auto-follow. This is a common failure mode for any policy-based replication/attachment primitive that is new-only.
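The bootstrap step amounts to a set difference: leader indexes that match the policy's patterns, minus those that already have a follower. Below is a minimal sketch under two assumptions: follower indexes share their leader's name, and Python's `fnmatchcase` glob matching is a close enough stand-in for the policy's wildcard patterns. The function name `bootstrap_plan` and the index names are hypothetical.

```python
from fnmatch import fnmatchcase


def bootstrap_plan(leader_indexes, follower_indexes, patterns):
    """Return leader indexes that match a pattern but have no follower yet."""
    existing = set(follower_indexes)
    return sorted(
        name
        for name in set(leader_indexes)
        if name not in existing
        and any(fnmatchcase(name, pattern) for pattern in patterns)
    )


leaders = ["app-issues", "app-code", "audit-log"]  # pre-existing on the leader
followers = ["app-code"]                           # already followed
print(bootstrap_plan(leaders, followers, ["app-*"]))  # ['app-issues']
```

Each name in the result would need an explicit follow request; only after that pass can the auto-follow policy be relied on to cover indexes created later.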
What CCR handles vs what it doesn't¶
CCR replicates documents. It does not handle:
- Failover orchestration — promoting a follower to leader after leader loss, re-pointing clients, coordinating config changes.
- Index deletion coordination — a leader-side delete must be echoed on the follower (or the follower will recreate the deleted index from the leader's history).
- Upgrade ordering — ES-version compatibility constraints between leader and follower during rolling upgrades.
- Multi-leader — CCR is one-way leader→follower; bidirectional replication requires per-index-role separation and external arbitration.
These lifecycle responsibilities fall to whoever owns the application layer. GitHub's GHES 3.19.1 rewrite is an explicit instance of this: "Elasticsearch only handles the document replication, and we're responsible for the rest of the index's lifecycle." (Source: sources/2026-03-03-github-how-we-rebuilt-the-search-architecture-for-high-availability)
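To make the failover gap concrete, here is a hedged sketch of the per-index part of a promotion: the ordered requests that convert a follower into a regular, writable index. The sequence (pause, close, unfollow, open) follows Elasticsearch's documented procedure as we understand it. The function name is hypothetical, the requests are only listed rather than sent, and this is not GitHub's actual workflow. Re-pointing clients and coordinating config changes remain outside it.

```python
def promote_follower_requests(follower_index):
    """Ordered (method, path) pairs that turn a follower into a regular index."""
    return [
        ("POST", f"/{follower_index}/_ccr/pause_follow"),  # stop pulling from the leader
        ("POST", f"/{follower_index}/_close"),             # unfollow needs a closed index
        ("POST", f"/{follower_index}/_ccr/unfollow"),      # drop the follower metadata
        ("POST", f"/{follower_index}/_open"),              # reopen as a normal index
    ]


for method, path in promote_follower_requests("app-issues"):
    print(method, path)
```

The order matters: an orchestrator has to treat the four calls as one unit and decide what to do if it is interrupted partway through.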
Why the segment-level grain matters¶
Because CCR replicates at the Lucene-segment level (immutable, durable), the follower cluster is never ahead of the leader's durable state, and the replication stream is replay-safe. This is the same "replicate-durable-state, not in-flight-operations" property that makes streaming logical replication safe in Postgres, block-level async replication safe in EBS snapshots, and WAL-based physical replication safe in distributed SQL systems.
CCR is structurally an instance of CDC (stream of durable storage changes) — at the Lucene-segment grain rather than the row grain.
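The two properties claimed here (the follower is never ahead of durable state, and replay is safe) can be illustrated with a toy model. This is a sketch of the property as this page describes it, not of Elasticsearch's internals; the `Leader` and `Follower` classes and their methods are invented for illustration.

```python
class Leader:
    def __init__(self):
        self.buffer = []    # accepted writes that are not yet durable
        self.segments = []  # immutable, durable units

    def write(self, doc):
        self.buffer.append(doc)

    def flush(self):
        """Persist buffered writes as one immutable segment."""
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer = []


class Follower:
    def __init__(self):
        self.segments = []

    def pull(self, leader, from_segment=None):
        """Copy durable segments; re-pulling an already-copied one is a no-op."""
        start = len(self.segments) if from_segment is None else from_segment
        for i in range(start, len(leader.segments)):
            if i < len(self.segments):
                continue  # already replicated, so replay changes nothing
            self.segments.append(leader.segments[i])


leader, follower = Leader(), Follower()
leader.write("a")
leader.flush()
leader.write("b")                      # still in the buffer, not durable
follower.pull(leader)                  # sees only the flushed segment
follower.pull(leader, from_segment=0)  # replay from the start is harmless
print(follower.segments)               # [('a',)]
```

Because the follower reads only from `segments`, the unflushed write `"b"` cannot reach it until the leader flushes.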
Sibling replication primitives¶
- CDC — row-level analog.
- patterns/block-level-continuous-replication — analog at the block-storage layer (AWS Elastic DR, Arpio).
- In-cluster primary-replica-shard replication — ES's other replication mechanism, intra-cluster, finer-grained, but doesn't give you cluster-level leader/follower semantics.
Canonical production shape (on the wiki)¶
The GHES HA-search rewrite shipped in GHES 3.19.1 is the wiki's canonical CCR production instance: collapse a multi-node ES cluster into N independent single-node clusters aligned to the app-layer primary/replica topology, then join them with CCR. See patterns/single-node-cluster-per-app-replica.
Seen in¶
- sources/2026-03-03-github-how-we-rebuilt-the-search-architecture-for-high-availability — GHES 3.19.1 uses CCR to replicate Lucene segments from the primary's single-node ES cluster to each replica's single-node ES cluster, replacing a failure-prone cross-node cluster. GitHub authored lifecycle workflows (failover, index deletion, upgrades, bootstrap) around CCR's document-replication primitive.
Related¶
- systems/elasticsearch — the vendor primitive.
- systems/lucene — segment-level replication boundary.
- systems/github-enterprise-server — canonical consumer.
- concepts/primary-replica-topology-alignment — the structural condition CCR allows you to express.
- patterns/single-node-cluster-per-app-replica — the deployment pattern CCR unlocks.
- patterns/bootstrap-then-auto-follow — the new-only-policy workaround pattern.
- concepts/change-data-capture — the general class CCR belongs to.
- patterns/block-level-continuous-replication — block-layer analog.