GitHub Enterprise Server¶
GitHub Enterprise Server (GHES) is GitHub's self-hosted distribution: the customer runs the appliance in their own datacenter or cloud, operates the HA pair themselves, and upgrades on their own cadence. Distinct from GitHub Enterprise Cloud (GHEC), the GitHub-operated SaaS offering — the managed github.com page covers GHEC.
Deployment topology¶
A typical HA GHES install is two appliance nodes:
- Primary node — receives all writes, all user traffic.
- Replica node — stays in sync, takes over on failover. Designed as read-only in steady state.
This leader/follower invariant runs deep: every GHES subsystem is aware of it, and anything that writes to shared state must live on the primary side.
Search substrate: 2026 CCR-based rewrite¶
Pre-rewrite topology (problem)¶
Historically, GHES ran one Elasticsearch cluster spanning the primary and replica GHES nodes. This was a forced design choice: pre-CCR Elasticsearch didn't support a leader/follower arrangement between separate clusters, so the only way to get search data from primary to replica was to let ES itself form a cross-node cluster with nodes on both GHES hosts.
This misaligned the storage topology with the application topology. Two failure modes followed:
- Index-maintenance footguns: running the wrong upgrade or maintenance sequence could leave search indexes damaged and in need of repair, or locked during upgrades.
- Mutual-blockage deadlock: ES was free to rebalance a primary shard onto the replica GHES node. If the replica was then taken down for maintenance, the replica waited for ES-cluster health before starting up, while ES couldn't become healthy until the replica rejoined.
Multi-release mitigations (health-check gates, drift-correction processes, an abandoned in-house "search mirroring" DB-replication effort) did not fix the root cause — they could not make ES behave like a leader/follower system when the cluster spanned both appliance nodes.
Post-rewrite topology (solution)¶
GHES 3.19.1 ships the rewrite as an opt-in: each GHES node now runs its own single-node Elasticsearch cluster, and the two clusters are linked with Elasticsearch Cross-Cluster Replication (CCR). The primary's ES cluster is the CCR leader and the replica's is the CCR follower; CCR replicates at the Lucene segment level — i.e. data that has already been durably persisted at the leader. See concepts/cross-cluster-replication and patterns/single-node-cluster-per-app-replica.
The rewrite is a canonical in-wiki instance of concepts/primary-replica-topology-alignment — the storage layer's replication direction now matches the application layer's write-ownership direction, and the failure mode is impossible by construction (ES can't move a primary shard to the follower cluster — there's no cross-cluster rebalancing in CCR).
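The wiring can be sketched against Elasticsearch's CCR REST API (remote-cluster registration plus a per-index follow call). This is a minimal illustration, not GHES's actual code: the cluster alias, seed address, and index name are invented, and the calls are modeled as data rather than sent to a live cluster.

```python
import json

def register_remote(alias: str, seed: str) -> tuple[str, str, dict]:
    """Point the follower cluster at the leader via a remote-cluster seed.
    Alias and seed host are illustrative assumptions."""
    return ("PUT", "/_cluster/settings", {
        "persistent": {"cluster": {"remote": {alias: {"seeds": [seed]}}}},
    })

def follow_index(index: str, alias: str) -> tuple[str, str, dict]:
    """Make the local copy of `index` a CCR follower of the
    same-named index on the leader cluster."""
    return ("PUT", f"/{index}/_ccr/follow", {
        "remote_cluster": alias,
        "leader_index": index,
    })

for method, path, body in (
    register_remote("leader", "primary-node:9300"),
    follow_index("issues", "leader"),
):
    print(method, path, json.dumps(body))
```

Because the follower relationship is declared per cluster pair rather than per shard, there is nothing analogous to shard rebalancing that could move a leader copy onto the follower node.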
Lifecycle workflows GitHub owns on top of CCR¶
Elasticsearch only handles document replication over CCR. Everything else is GitHub's responsibility:
- Bootstrap workflow — CCR's auto-follow API only covers indexes created after the policy exists. GHES has a long-lived set of pre-existing indexes, so the rewrite adds an imperative bootstrap step that enumerates current indexes, attaches followers to them, and then installs the auto-follow policy for future indexes. See patterns/bootstrap-then-auto-follow.
- Failover workflow — moving the CCR leader role from a failed primary to the promoted replica.
- Index deletion workflow — coordinating deletion across leader and follower so CCR doesn't recreate the index after the leader deletes it.
- Upgrade workflow — ordering ES upgrades + CCR version compatibility for rolling / non-rolling paths.
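One plausible ordering for the deletion coordination above can be sketched as a plan of Elasticsearch REST calls: stop replication on the follower first, so nothing can re-pull the index after the leader-side delete. The index name is an illustrative assumption, and the source does not disclose GHES's actual sequence.

```python
def deletion_plan(index: str) -> list[tuple[str, str, str]]:
    """Return (cluster, method, path) steps for deleting a CCR-replicated
    index; a sketch, not GHES's documented workflow."""
    return [
        # 1. Follower cluster: stop CCR for this index.
        ("follower", "POST", f"/{index}/_ccr/pause_follow"),
        # 2. Follower cluster: drop the local, now-detached copy.
        ("follower", "DELETE", f"/{index}"),
        # 3. Leader cluster: drop the source index last.
        ("leader", "DELETE", f"/{index}"),
    ]

for cluster, method, path in deletion_plan("stale-index"):
    print(cluster, method, path)
```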
Elasticsearch handles only the document-replication leg. Everything else — the full index lifecycle — is GitHub-authored code around CCR.
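The bootstrap-then-auto-follow sequence can be sketched the same way: an imperative follow call per pre-existing index, then one declarative auto-follow policy for everything created afterward. The policy name, index names, and the `*` pattern are assumptions for illustration; the REST paths are Elasticsearch's CCR API.

```python
def bootstrap_then_auto_follow(existing_indexes: list[str],
                               alias: str = "leader") -> list[tuple[str, str, dict]]:
    """Return (method, path, body) calls for the two-phase setup."""
    calls = []
    # Phase 1, imperative: auto-follow only applies to *future* indexes,
    # so every index that already exists needs an explicit follow call.
    for index in existing_indexes:
        calls.append(("PUT", f"/{index}/_ccr/follow",
                      {"remote_cluster": alias, "leader_index": index}))
    # Phase 2, declarative: one auto-follow policy covers everything the
    # leader creates from now on. Policy name and pattern are invented.
    calls.append(("PUT", "/_ccr/auto_follow/all-search-indexes",
                  {"remote_cluster": alias,
                   "leader_index_patterns": ["*"]}))
    return calls

for method, path, body in bootstrap_then_auto_follow(["issues", "code", "pulls"]):
    print(method, path, body)
```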
Enabling CCR mode¶
- Customer contacts GitHub Support, who provisions the required license.
- Run ghe-config app.elasticsearch.ccr true.
- Run config-apply, or upgrade the HA cluster to 3.19.1.
- On restart, ES consolidates all data onto the primary, breaks cross-node clustering, and restarts replication via CCR.
The migration duration scales with instance size (no numbers disclosed). Default-on rollout is planned over the next two years. (Source: sources/2026-03-03-github-how-we-rebuilt-the-search-architecture-for-high-availability)
Relationship to github.com / GHEC¶
GHES ships the same application as github.com but in an appliance-deployable, HA-pair topology. github.com operates a much larger search infrastructure (not a two-node HA pair) — the rewrite described here is a GHES-specific topology choice; it does not imply any change to the github.com / GHEC search stack.
Stub caveats¶
- This page is stubbed around the 2026-03 CCR-search rewrite. Other GHES surfaces (HA for MySQL / Redis / Git storage, backup tools, ephemeral-node deployments) are not yet documented here.
- GHES's version-support / deprecation policy, customer-operable knobs, and appliance-image packaging are out of scope for this stub.
- Licensing specifics of the CCR-enabled ES distribution shipped with GHES 3.19.1 are not documented in the source.
Seen in¶
- sources/2026-03-03-github-how-we-rebuilt-the-search-architecture-for-high-availability — 2026-03 rewrite of HA search: monolithic cross-node ES cluster → per-node single-node ES clusters linked by CCR. GHES 3.19.1 opt-in, default over ~2 years. GitHub-authored lifecycle workflows (failover / deletion / upgrade / bootstrap) on top of CCR's document-replication primitive.
Related¶
- systems/github — the product; GHES is its self-hosted deployment form.
- systems/elasticsearch — the search substrate; CCR is an Elasticsearch feature.
- concepts/cross-cluster-replication — the generalised primitive underpinning the rewrite.
- concepts/primary-replica-topology-alignment — the structural lesson the rewrite exemplifies.
- patterns/single-node-cluster-per-app-replica — the deployment shape the rewrite adopts.
- patterns/bootstrap-then-auto-follow — the imperative-then-declarative setup pattern CCR's new-only auto-follow policy forces.