gh-ost
Definition
gh-ost (GitHub Online Schema Migration Tool) is a triggerless, binlog-based online schema change tool for MySQL, open-sourced by GitHub in 2016. It executes arbitrary ALTER TABLE statements against a live production table without blocking writes and without causing sustained replication lag, by (a) creating a ghost table that is an empty copy of the original with the new schema applied, (b) backfilling it from the original in ordered chunks, (c) tailing the binlog to capture concurrent writes and replay them onto the ghost, and (d) atomically renaming the tables at a brief cut-over. It is the canonical implementation of the shadow-table online schema change pattern. Repo: github.com/github/gh-ost.
Positioned against pt-online-schema-change (Percona Toolkit): pt-osc uses triggers on the original table to mirror writes onto the ghost; gh-ost replaces triggers with binlog tailing, which decouples the migration load from the primary's write path and makes progress externally observable/pauseable.
Seen in
- sources/2026-04-21-planetscale-the-state-of-online-schema-migrations-in-mysql: Shlomi Noach (PlanetScale, 2024-07-23). Canonical 2024-era taxonomic-peer framing: gh-ost is named alongside pt-osc, "recent newcomer" spirit, and Vitess as the four third-party shadow-table tools. Canonical six-property operational profile shared by all four (mimic-alter + slower + extra-disk + binlog-bloat + throttle-respecting + batched-interruptible). Canonical gap: gh-ost does not auto-detect ALGORITHM=INSTANT eligibility the way Vitess and spirit do; it always executes the shadow-table path. Operators running gh-ost who want the INSTANT fast path must invoke it outside the tool's workflow. Noach is gh-ost's original author (open-sourced from GitHub 2016); his implicit verdict in the 2024 survey is that the default-shadow-table-with-no-INSTANT-short-circuit design remains correct: "if you already have to use one of the 3rd party solutions, you may as well use it all the time." Positions gh-ost as complementary to, not replaced by, INSTANT + the auto-detect pattern; the underlying shadow-table mechanism is "still the go-to solution" for the majority of 2024-era production schema changes.
- Lucy Burns, PlanetScale, 2021-05-20. Canonical stack disclosure: "Using Vitess and gh-ost under the hood, we provide our users with a safe, easy, reliable way to push schema changes to production." PlanetScale wraps gh-ost (alongside Vitess) as the underlying migration engine inside its branch-based deploy-request workflow. Positions gh-ost against pt-online-schema-change in the prior-art taxonomy: both provide online schema migration, but "these tools are often run manually and require the support of additional infrastructure"; PlanetScale's contribution is the managed workflow around gh-ost, not gh-ost itself. Links out to GitHub's own 2016 announcement: github.blog/2016-08-01-gh-ost-github-s-online-migration-tool-for-mysql.
Mechanism summary
Four-phase shape (see patterns/shadow-table-online-schema-change for the full pattern write-up):
1. Create ghost table: CREATE TABLE _tbl_gho LIKE tbl, then apply the user's ALTER to _tbl_gho. Ghost is empty.
2. Backfill: copy rows from the original in ordered chunks, walking a unique key range by range.
3. Apply binlog events: tail the primary's binlog; each concurrent INSERT/UPDATE/DELETE on tbl is replayed onto _tbl_gho. Runs concurrently with step 2.
4. Cut-over: atomic rename of tbl → _tbl_del and _tbl_gho → tbl. Original table is kept as _tbl_del for quick rollback.
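The interplay between the backfill and the binlog replay is easiest to see in a toy model. The sketch below is not gh-ost's code: it simulates the four phases on in-memory dicts, and every name in it (migrate, transform, concurrent_writes, add_email) is hypothetical. It assumes that the chunk copy never overwrites a row already present in the ghost, and that replayed events are applied in order and win over copied rows.

```python
from collections import deque


def migrate(original, transform, concurrent_writes, chunk_size=2):
    """Toy shadow-table migration over dicts keyed by primary key.

    original          -- dict pk -> row; stands in for tbl, mutated by live writes
    transform         -- maps an old-schema row to a new-schema row (the ALTER)
    concurrent_writes -- list of (op, pk, row) that land on tbl during the copy
    """
    ghost = {}                       # phase 1: empty ghost table (_tbl_gho)
    binlog = deque()                 # stand-in for the binlog stream
    pending = deque(concurrent_writes)

    def write_to_original():
        # An application write hits tbl and is recorded in the binlog.
        op, pk, row = pending.popleft()
        if op == "delete":
            original.pop(pk, None)
        else:                        # "insert" or "update"
            original[pk] = row
        binlog.append((op, pk, row))

    def drain_binlog():
        # Phase 3: replay captured events onto the ghost, in order.
        while binlog:
            op, pk, row = binlog.popleft()
            if op == "insert":
                ghost[pk] = transform(row)       # replace semantics
            elif op == "update" and pk in ghost:
                ghost[pk] = transform(row)       # no-op if the row is not copied yet
            elif op == "delete":
                ghost.pop(pk, None)

    keys = sorted(original)          # copy range is fixed when the migration starts
    for start in range(0, len(keys), chunk_size):
        # Phase 2: copy one chunk without overwriting rows already in the ghost.
        for pk in keys[start:start + chunk_size]:
            if pk in original:
                ghost.setdefault(pk, transform(original[pk]))
        if pending:
            write_to_original()      # interleave one live write per chunk
        drain_binlog()

    while pending:                   # writes that arrive after the copy finishes
        write_to_original()
    drain_binlog()

    # Phase 4: cut-over. The ghost takes the table name; the old table is kept.
    return {"tbl": ghost, "_tbl_del": original}


def add_email(row):
    # Hypothetical ALTER: add a nullable email column.
    return {**row, "email": None}


result = migrate(
    {1: {"name": "a"}, 2: {"name": "b"}, 3: {"name": "c"}, 4: {"name": "d"}},
    add_email,
    [("update", 1, {"name": "A"}), ("delete", 3, None),
     ("insert", 5, {"name": "e"}), ("update", 4, {"name": "D"})],
)
```

After the run, result["tbl"] holds the new-schema version of every row that survived in the original, including the row inserted mid-copy, while the deleted row is absent. A real migration additionally has to make the final swap atomic with respect to in-flight writes, which this model sidesteps by stopping writes before the cut-over.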
Key distinguishing traits vs pt-online-schema-change:
- Triggerless. Uses binlog tailing instead of per-row triggers on the original table. Reduces primary-write overhead.
- Throttle-aware. Exposes throttle hooks on replica lag, MySQL load (thresholds on status variables such as Threads_running), and a control file; migration can be paused/resumed externally without killing the job. This design heavily influenced the later Vitess throttler abstraction.
- Interruptible / resumable. The migration writes progress state to the ghost table itself; restarts continue from the last chunk.
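The throttling trait can be sketched as a check that runs before each chunk copy. This is an illustrative model only, not gh-ost's implementation or its flag names: ThrottleConfig, throttle_reason, copy_with_throttle, the default thresholds, and the flag-file path are all hypothetical, and the load signal is assumed to be a single threads-running figure supplied by the caller.

```python
import os
import time
from dataclasses import dataclass


@dataclass
class ThrottleConfig:
    # All names and defaults here are hypothetical, chosen for the sketch.
    max_lag_ms: int = 1500
    max_threads_running: int = 25
    flag_file: str = "/tmp/migration.throttle"


def throttle_reason(cfg, replica_lag_ms, threads_running):
    """Return why the copy should pause, or None if it may proceed."""
    if os.path.exists(cfg.flag_file):
        return "flag-file"           # an operator created the control file
    if replica_lag_ms > cfg.max_lag_ms:
        return "replica-lag"
    if threads_running > cfg.max_threads_running:
        return "load"
    return None


def copy_with_throttle(chunks, copy_chunk, probe, cfg, sleep=time.sleep):
    """Copy chunks one at a time, pausing while any throttle condition holds.

    probe() returns (replica_lag_ms, threads_running); it and copy_chunk are
    supplied by the caller, so this function touches no database itself.
    """
    pauses = []
    for chunk in chunks:
        while (reason := throttle_reason(cfg, *probe())) is not None:
            pauses.append(reason)
            sleep(0.1)               # back off and re-check; the job stays alive
        copy_chunk(chunk)
    return pauses
```

Because the pause is a loop inside the running job rather than a kill and restart, creating or removing the control file is enough to stop and resume the copy from outside the process.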
Where it runs
- GitHub's own MySQL fleet — gh-ost was built for GitHub's production migrations; see GitHub Engineering, 2016.
- PlanetScale: the migration engine under PlanetScale's deploy-request workflow in the 2021-era architecture; confirmed by Lucy Burns's 2021 post (see Citations).
- Standalone by operators — independently deployable against any MySQL primary with binlog access.
Relationship to Vitess
Vitess has its own online-DDL implementation (see VReplication-driven schema changes and systems/vitess-schemadiff) which largely supersedes gh-ost for Vitess-native deployments — but gh-ost was the earlier, standalone tool that influenced the design of both. PlanetScale's 2021 architecture composes both: Vitess for orchestration, gh-ost for the migration engine.
Citations
- Lucy Burns, Non-blocking schema changes, PlanetScale, 2021-05-20: canonical downstream use as PlanetScale's migration engine.
- Shlomi Noach, The promises and realities of the relational database model, PlanetScale, 2021-07-13: Noach's earliest wiki-ingested PlanetScale essay. Names gh-ost ("people will commonly use 3rd party tools such as gh-ost or pt-online-schema-change, which run an online schema change through emulation and replacement") and canonicalises the load-bearing limitation: "these require access to your production system. The developer needs to understand how to invoke these tools; how to configure throttling; how to observe and monitor their progress; how to clean up their artifacts." gh-ost is necessary but not sufficient for developer-owned schema change: the mechanism is correct, but six operational skills (metadata-locking, failure-mode literacy, production topology, tool invocation, throttling configuration, cleanup) are required to drive it, and a human must own coordination + failure recovery + scheduling that gh-ost does not provide. The 2021 post predates Noach's 2022 paradigm essay by 10 months and is its problem-shape-naming prequel; together they frame gh-ost as a partial solution whose gaps PlanetScale's deploy-request + branching + queue layer fills.
- Shlomi Noach, The operational relational schema paradigm, PlanetScale, 2022-05-09: Noach is the ex-GitHub engineer who created gh-ost. This essay is his foundational principles charter for what the mechanisms he built (gh-ost) + the product he joined (PlanetScale) are structurally trying to achieve. The ten tenets map directly onto gh-ost's design goals: non-blocking (tenet 1), resource-aware throttling (tenet 2), asynchronous execution (tenet 3), interruptible (tenet 5), ETA-observable (tenet 6), failover-agnostic (tenet 7 via persistent state). gh-ost is the mechanism embodiment of tenets 1, 2, 5, 6, 7, and 8 (the retained shadow table that enables schema revert); the deploy-request + branching + queue layer PlanetScale wraps around gh-ost adds tenets 3, 4, and 9 at the workflow altitude.