
GitHub Pages¶

GitHub Pages is GitHub's static-site hosting product (*.github.io + custom domains). One of GitHub's earliest platform offerings, it was born simple: up to early 2015 the entire service ran on a single active/standby pair of machines, with user data on 8 DRBD-backed partitions and a cron-regenerated nginx map file that translated hostnames to on-disk paths. Canonical wiki anchor for the "simple components, don't prematurely generalise" design philosophy ("avoiding prematurely solving problems that aren't yet problems").
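To make the pre-2015 mechanism concrete, here is a minimal sketch of what a cron job regenerating such a map file might look like. It renders a standard nginx `map` block from a hostname-to-path table. The function name, the `$pages_root` variable, and the example hostnames and paths are all hypothetical; the source does not show the real generator or its output.

```python
from typing import Mapping


def render_nginx_map(
    sites: Mapping[str, str],
    source_var: str = "$http_host",    # assumption: keyed on the Host header
    target_var: str = "$pages_root",   # hypothetical variable name
) -> str:
    """Render an nginx `map` block translating hostnames to on-disk paths."""
    lines = [f"map {source_var} {target_var} {{", '    default "";']
    # Sort for a stable file, so successive cron runs diff cleanly.
    for hostname in sorted(sites):
        lines.append(f"    {hostname} {sites[hostname]};")
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sites; a real table would hold every published site.
    print(render_nginx_map({
        "example-user.github.io": "/data/pages/3/example-user",
        "docs.example.com": "/data/pages/5/example-docs",
    }))
```

Every published site adds one line to the file, so both the regeneration interval and the load time at nginx startup grow with the number of sites.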

2015 rearchitecture (canonical source)¶

Scale forcing function: "thousands of requests per second to over half a million sites" + storage ceiling of a single pair + 30-minute publish latency + cold-restart cost of loading a giant pre-generated nginx map. The rewrite splits the stack into two tiers:

Frontend routing tier — Dell C5220s running nginx with an ngx_lua script in access_by_lua_file. Per request, the Lua script queries a MySQL read replica for which backend fileserver pair hosts the requested site, sets $gh_pages_host + $gh_pages_path, and proxy_passes to the backend. Canonical wiki instance of patterns/db-routed-request-proxy.

Fileserver tier — pairs of Dell R720s running active/standby with DRBD-synchronous replication across 8 partitions; nginx document root = X-GitHub-Pages-Root header set by the router. Each pair is shaped identically to the pre-2015 single pair, so the rewrite reused "large parts of our configuration and tooling". Scaling is horizontal by adding pairs — patterns/horizontally-scale-stateful-tier-via-pairs.
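The per-request flow across the two tiers can be sketched as follows. This is an illustration in Python, not the production Lua script: an in-memory `sqlite3` database stands in for the MySQL read replica, and the `pages_routes` table, its columns, the function names, and the example hostnames are assumptions made for the sketch. Only the idea of resolving a hostname to a fileserver host plus path, and passing the path on in the `X-GitHub-Pages-Root` header, comes from the source.

```python
import sqlite3
from typing import NamedTuple, Optional


class Route(NamedTuple):
    host: str  # backend fileserver pair (the role of $gh_pages_host)
    path: str  # on-disk site root (the role of $gh_pages_path)


def open_demo_db() -> sqlite3.Connection:
    """Stand-in for a MySQL read replica; the schema is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages_routes ("
        "hostname TEXT PRIMARY KEY, fileserver TEXT NOT NULL, path TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO pages_routes VALUES (?, ?, ?)",
        [
            ("example-user.github.io", "pages-fs-1", "/data/pages/3/example-user"),
            ("docs.example.com", "pages-fs-2", "/data/pages/5/example-docs"),
        ],
    )
    return conn


def lookup_route(conn: sqlite3.Connection, hostname: str) -> Optional[Route]:
    """One routing query per request, keyed on the requested hostname."""
    row = conn.execute(
        "SELECT fileserver, path FROM pages_routes WHERE hostname = ?",
        (hostname.lower(),),
    ).fetchone()
    return Route(*row) if row else None


def plan_proxy(conn: sqlite3.Connection, hostname: str) -> Optional[dict]:
    """Describe the upstream request the router would make, or None if unknown."""
    route = lookup_route(conn, hostname)
    if route is None:
        return None  # the router would answer 404 here
    return {
        "upstream": f"http://{route.host}",
        # The fileserver uses this header as its nginx document root.
        "headers": {"X-GitHub-Pages-Root": route.path},
    }


if __name__ == "__main__":
    db = open_demo_db()
    print(plan_proxy(db, "example-user.github.io"))
    print(plan_proxy(db, "unknown.github.io"))
```

Because the routing decision is a row lookup rather than a file baked into the web server, adding a fileserver pair or moving a site amounts to a data change.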

Availability-dependency trade-off¶

The rewrite introduces a first-order availability dependency on MySQL — per-request routing lookups run against the DB. Four mitigations disclosed in the post:

Retry-on-error to a different read replica. The Lua router reconnects to an alternate replica on query failure.
30-second shared-memory cache of routing lookups on the pages-fe node. Reduces MySQL load and absorbs short blips. Canonical wiki instance of patterns/cached-lookup-with-short-ttl.
Reads go to replicas, not the master. MySQL master failovers and master-tier maintenance windows don't take Pages down: "existing Pages will remain online even during database maintenance windows where we have to take the rest of the site down".
Fastly in front caches every 200 response. Even a total router outage leaves cached Pages sites reachable. Canonical wiki instance of patterns/cdn-in-front-for-availability-fallback.
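The first two mitigations compose naturally, and a minimal Python sketch of the combination follows. It is not the Lua implementation: the `CachedRouter` class, the callable-per-replica interface, and the use of `ConnectionError` to signal a failed query are assumptions made for illustration. The choice to cache only successful lookups is also an assumption; the source does not say how misses are handled.

```python
import time
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical replica interface: hostname -> route string, or None if unknown.
# A failed query is signalled by raising ConnectionError.
ReplicaQuery = Callable[[str], Optional[str]]


class CachedRouter:
    def __init__(
        self,
        replicas: Sequence[ReplicaQuery],
        ttl: float = 30.0,  # the 30-second lifetime from the post
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.replicas = list(replicas)
        self.ttl = ttl
        self.clock = clock
        # hostname -> (expiry time, route)
        self._cache: Dict[str, Tuple[float, str]] = {}

    def resolve(self, hostname: str) -> Optional[str]:
        now = self.clock()
        hit = self._cache.get(hostname)
        if hit is not None and now < hit[0]:
            return hit[1]  # fresh cache entry: no database round trip

        last_error: Optional[Exception] = None
        for query in self.replicas:
            try:
                route = query(hostname)
            except ConnectionError as exc:
                last_error = exc  # retry on the next replica
                continue
            if route is not None:
                # Assumption: only successful lookups are cached.
                self._cache[hostname] = (now + self.ttl, route)
            return route
        raise RuntimeError("all read replicas failed") from last_error


if __name__ == "__main__":
    def down(hostname: str) -> Optional[str]:
        raise ConnectionError("replica unavailable")

    def up(hostname: str) -> Optional[str]:
        return "pages-fs-1:/data/pages/3/example-user"

    router = CachedRouter([down, up])
    print(router.resolve("example-user.github.io"))
```

In this sketch a replica blip costs one extra query on a cache miss and nothing on a hit, which mirrors the role the short TTL plays in absorbing brief database trouble.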

Performance¶

Disclosed: less than 3 ms of each request is spent in Lua at the 98th percentile (p98), "including time spent in external network calls", across millions of HTTP requests per hour. The p98 bound therefore covers the MySQL lookup path, which is the load-bearing datum for DB-routed proxy viability under production traffic.

Operational wins of the rewrite¶

Instant publishing: new Pages sites go live as soon as the MySQL row is written; no 30-minute cron wait.
No cold-restart penalty: nginx doesn't load a giant map at startup.
Horizontal storage scaling: storage capacity grows by adding fileserver pairs, not by fattening a single machine.

Historical note¶

The source post is an adaptation of GitHub's 2015 writeup, republished on github.blog in 2025. The architecture described is 10 years old; GitHub Pages has almost certainly evolved. Ingested here as a canonical historical-architecture reference for DB-routed static hosting at scale.

Seen in¶

sources/2025-09-02-github-rearchitecting-github-pages — 2015 rearchitecture from single-pair + 30-min map regen to MySQL-routed nginx frontend + DRBD-replicated fileserver pair tier + Fastly front-cache; <3 ms p98 in Lua at millions RPH.

systems/github — parent product; Pages is one of GitHub's earliest platform surfaces.
systems/nginx, systems/ngx-lua — frontend routing stack.
systems/drbd — fileserver-pair replication primitive.
systems/fastly — CDN fronting Pages.
systems/mysql — routing table; read replicas queried per request.
concepts/active-standby-replication — fileserver-pair HA shape.
patterns/db-routed-request-proxy — per-request routing via DB.
patterns/cdn-in-front-for-availability-fallback — cached 200s survive origin-router outage.
patterns/cached-lookup-with-short-ttl — 30 s shared-memory cache on hot lookups.
patterns/horizontally-scale-stateful-tier-via-pairs — scale by adding pairs, not by redesigning the pair.