
Single-Node Cluster per App Replica

Pattern

Deploy one single-node storage cluster on each application replica host and link those per-host clusters together via store-level replication, instead of running one multi-node storage cluster spanning all the application replicas.

When to use

The precondition is a storage system that natively wants to form multi-node clusters (and rebalance across them), deployed inside an application with a strict primary/replica write-ownership invariant, and with no requirement to shard the storage across more hosts than the app-replica count. See concepts/primary-replica-topology-alignment for the structural motivation.

Concrete fit:

  • Storage system offers cluster-to-cluster replication as a first-class primitive (Elasticsearch CCR, etc.).
  • The app runs on ~2 hosts (primary + replica); scaling horizontally per tenant beyond that is rare.
  • The failure modes of a single storage cluster spanning both hosts are observable or known: e.g. the primary shard migrating to the read-only replica host, producing a stuck-replica deadlock (see concepts/primary-replica-topology-alignment).

Why it works

  • The storage layer loses its rebalancing freedom by construction. There's no cross-host cluster anymore, so ES (or whatever) cannot move write-owning shards onto the follower host. The app's primary/replica invariant is enforced at the topology level, not by a policy that could be misconfigured.
  • Lifecycle decouples from topology. Each per-host cluster can be upgraded, restarted, backed up, diagnosed independently — the blast radius of a single-cluster operation is one app replica, not the whole HA pair.
  • Replication is leader→follower, explicit, and one-way. The storage layer's replication topology mirrors the app's write-ownership topology exactly.
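The "enforced at the topology level" point can be made concrete with a small sketch. This is hypothetical illustration code (the class and method names are not from GHES or Elasticsearch): each app host owns its own single-node cluster, and both the writable host and the replication link are *derived* from the same primary designation, so write ownership and replication direction cannot be configured to disagree.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HostCluster:
    """A single-node storage cluster pinned to one app host."""
    host: str  # e.g. "node-a"

    def members(self) -> list[str]:
        # Single-node by construction: the only member is its own host,
        # so there is no other node for a rebalancer to move shards to.
        return [self.host]

@dataclass(frozen=True)
class Topology:
    primary: HostCluster
    replica: HostCluster

    def writable_hosts(self) -> set[str]:
        # Write ownership is a property of the topology itself,
        # not a policy a rebalancer could override.
        return {self.primary.host}

    def replication_links(self) -> list[tuple[str, str]]:
        # Exactly one link, always leader -> follower, mirroring
        # the app's write-ownership direction.
        return [(self.primary.host, self.replica.host)]

topo = Topology(HostCluster("node-a"), HostCluster("node-b"))
assert topo.writable_hosts() == {"node-a"}
assert topo.replication_links() == [("node-a", "node-b")]
```

The design choice worth noting: there is no field for "which host may receive writes" that could drift out of sync with the replication link; both are computed from `primary`.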

Trade-offs

  • No horizontal scaling of a single index across machines. A single-node cluster can scale vertically (bigger machine, more RAM, more disk) but cannot shard an index across multiple hosts. For search indexes that fit on one host, this is fine; for TB-per-replica corpora it's a ceiling.
  • Custom lifecycle glue. Store-level replication (e.g. CCR) usually covers only document replication. Failover, index deletion, upgrades, and bootstrap for pre-existing indexes are the customer's problem. GHES's rewrite acknowledges this explicitly: "Elasticsearch only handles the document replication, and we're responsible for the rest of the index's lifecycle."
  • New-only auto-follow. CCR-class primitives typically have a pattern-based auto-follow policy that only matches new indexes; pre-existing indexes need an imperative bootstrap step. See patterns/bootstrap-then-auto-follow.
  • One-way migration. Collapsing an existing multi-node cluster into N single-node clusters is a data-consolidation step that is difficult to reverse in the same release. GHES's rewrite "consolidates all the data onto the primary nodes, breaks clustering across nodes, and restarts replication using CCR" at upgrade time.
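The new-only auto-follow trade-off implies a two-part migration plan: install the pattern-based policy for future indexes, then imperatively bootstrap every pre-existing index the pattern would have matched. A minimal sketch, assuming glob-style auto-follow patterns (the action names here are illustrative, not a real CCR API):

```python
import fnmatch

def migration_plan(existing_indexes: list[str], pattern: str) -> list[tuple[str, str]]:
    """Return the ordered actions for bootstrap-then-auto-follow.

    Assumption: the auto-follow policy only fires for indexes created
    *after* it is installed, so indexes that already exist need an
    explicit per-index follow call.
    """
    actions = [("install_auto_follow", pattern)]
    # Pre-existing matches get an imperative bootstrap step;
    # future indexes matching the pattern need no action at all.
    actions += [("bootstrap_follow", ix)
                for ix in existing_indexes
                if fnmatch.fnmatch(ix, pattern)]
    return actions

plan = migration_plan(["code-search-001", "audit-log-7"], "code-search-*")
assert plan == [("install_auto_follow", "code-search-*"),
                ("bootstrap_follow", "code-search-001")]
```

Ordering matters: installing the policy first closes the window in which an index created mid-migration would be matched by neither path.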

Canonical wiki instance

GHES 3.19.1 HA search rewrite (2026-03-03) is the canonical production wiki instance. Pre-rewrite: one Elasticsearch cluster spanning the primary and replica GHES nodes, with primary-shard rebalancing free to move writers onto the replica — a topology-misalignment failure mode with a mutual-dependency deadlock on replica-down. Post-rewrite: per-node single-node ES clusters joined by CCR; GHES authored failover / deletion / upgrade / bootstrap workflows on top. (Source: sources/2026-03-03-github-how-we-rebuilt-the-search-architecture-for-high-availability)

Generalization

The pattern generalizes to any pairing of:

  • A storage layer that natively forms multi-node clusters and rebalances writer-owning shards.
  • An application with a strict primary/replica invariant at its write-ownership level.
  • A store-level inter-cluster replication primitive at an appropriate grain (segment, block, row).

Candidate shapes elsewhere: per-node PostgreSQL with logical replication to a read-only follower rather than a Patroni-managed multi-node cluster; a per-node Kafka broker streaming to a DR follower broker via MirrorMaker rather than a stretched cluster. The discipline is the same: keep the replication direction locked to the app's write-ownership direction.
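One way to keep that discipline in the custom lifecycle glue is to model failover as a single operation that flips write ownership and replication direction together, so they can never be changed independently. A hedged sketch (hypothetical state shape; real failover also has to pause replication and convert follower indexes to writable, which is elided here):

```python
def failover(state: dict) -> dict:
    """Swap leader and follower atomically.

    state: {"leader": host, "follower": host} where replication is
    understood to always run leader -> follower and writes always
    land on the leader. Because both facts are derived from the same
    pair, flipping the pair flips both at once.
    """
    return {"leader": state["follower"], "follower": state["leader"]}

state = {"leader": "node-a", "follower": "node-b"}
state = failover(state)
assert state == {"leader": "node-b", "follower": "node-a"}
```

Contrast with storing "writable host" and "replication source" as two separate settings, where a partial failover can leave the follower writable while replication still flows toward it.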
