CONCEPT Cited by 2 sources
Multi-shard schema sync¶
Definition¶
Multi-shard schema sync is the problem of keeping the schema identical across every shard of a horizontally sharded database while a schema change rolls out — and its solution: gating the final cut-over across the whole shard fleet until every shard reports ready to seal. "By design, a multi-sharded database acts as though it were a monolith" — schema monolithism is an explicit design invariant, and multi-shard schema sync is the operational discipline that preserves it during migrations.
Canonical verbatim framing:
"Different shards work under different workloads, and may run a schema migration to completion minutes to possibly hours apart. A multi-sharded database where different shards have different schemas can be either inconsistent in performance, or outright inconsistent in design. Our gated deployments minimize that gap period, by tracking the progress of a schema deployment across all shards and holding off the final switch to the new schema until all shards are ready. The switch then takes place almost simultaneously (though not atomically) on all shards."
Why shards drift without coordination¶
A Vitess-style sharded database applies DDL per shard, independently. Each shard runs its own shadow-table copy-and-swap migration:
- Each shard has its own primary + replicas, own binlog, own workload, own live-data size.
- A 500 GB-per-shard migration with even bandwidth finishes at about the same time across shards. A skewed workload — one shard absorbing heavier traffic, or one shard holding a hot table twice the size of the others — means some shards finish "minutes to possibly hours" later than others.
- Between the first shard's cut-over and the last shard's cut-over, the cluster has schema skew: schema A on shards 1–6, schema B on shards 7–8, for the duration of the drift window.
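The skew described above can be made concrete with illustrative numbers (the durations below are hypothetical, not taken from the source): without a gate, the cluster runs mixed schemas from the first finisher's cut-over to the last finisher's.

```python
# Hypothetical per-shard copy-phase durations (hours) for an 8-shard fleet.
# A skewed workload makes some shards finish much later than others.
durations_h = {
    "shard-1": 2.0, "shard-2": 2.1, "shard-3": 1.9, "shard-4": 2.0,
    "shard-5": 2.2, "shard-6": 2.1,
    "shard-7": 5.5,  # hot shard: twice the data, heavier traffic
    "shard-8": 2.3,
}

# Ungated, each shard cuts over as soon as its own migration completes,
# so schema skew lasts from the first finisher to the last finisher.
first = min(durations_h.values())
last = max(durations_h.values())
drift_window_h = last - first
print(f"ungated schema-skew window: {drift_window_h:.1f} hours")
# → ungated schema-skew window: 3.6 hours
```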
The drift window's cost:
- Performance inconsistency. The same query runs against different schemas on different shards; if the migration added an index, the index-using plan is available on some shards but not others. A scatter-gather (cross-shard) query's p99 latency is dominated by the slowest-schema shard.
- Design inconsistency. If the migration adds a column or changes a type, application-level reads expecting the new schema fail on old-schema shards. Route-by-WHERE-clause shard selection can land a read on an old-schema shard during the drift window even after the new-schema rollout is visible elsewhere.
The gating solution¶
The deploy controller aggregates readiness flags across shards. Each per-shard migration parks in the staged-then-sealed state — copy phase complete, binlog tail running, ready-to-seal flag published. The gate does not open until every shard reports ready. When the gate opens, every shard cuts over "almost simultaneously (though not atomically)" — the window is compressed from minutes-to-hours of drift to seconds of cross-shard cut-over.
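A minimal sketch of the readiness-aggregation gate. The source does not disclose the actual protocol, so every class and method name here is hypothetical; the point is only the invariant: no shard cuts over while any shard is still copying.

```python
from dataclasses import dataclass


@dataclass
class ShardMigration:
    """One shard's shadow-table migration, parked once copy + binlog tail are done."""
    shard: str
    ready_to_seal: bool = False  # published when staged-then-sealed state is reached


class DeployGate:
    """Hypothetical deploy-controller gate: holds the final switch until
    every shard in the fleet has published its ready-to-seal flag."""

    def __init__(self, shards):
        self.migrations = {s: ShardMigration(s) for s in shards}

    def report_ready(self, shard):
        self.migrations[shard].ready_to_seal = True

    def gate_open(self):
        return all(m.ready_to_seal for m in self.migrations.values())

    def try_cutover(self):
        if not self.gate_open():
            return []  # hold: at least one shard is still copying
        # Gate open: cut over shard after shard, back to back —
        # almost simultaneous, though not atomic.
        return [f"cutover {s}" for s in self.migrations]


gate = DeployGate([f"shard-{i}" for i in range(1, 4)])
gate.report_ready("shard-1")
print(gate.try_cutover())  # [] — shard-2 and shard-3 not yet ready
gate.report_ready("shard-2")
gate.report_ready("shard-3")
print(gate.try_cutover())  # cutover list for all three shards
```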
This is the same gating primitive that gated deployment applies across multiple changes within one shard — but applied across one change × N shards rather than M changes × one shard. The deploy controller's readiness-aggregation operates over the changes × shards matrix; the gate opens when every cell is green.
Two orthogonal dimensions of "multi-dimensional"¶
Shlomi Noach's framing in the 2022 Gated Deployments launch post:
"The problem begins with multi-dimensional deployments. With these, you will either have multiple schema changes in the same deployment, or have a single change deployed over a multi-sharded database, or both."
The two dimensions:
| Dimension | Problem shape | Solved by |
|---|---|---|
| Multiple changes, one shard | Partial-deployment window | patterns/near-atomic-multi-change-deployment |
| One change, multiple shards | Per-shard drift window | multi-shard schema sync |
| Both | Combinatorial drift | Gate across changes × shards |
Multi-shard schema sync is the single-change, multiple-shard solution. The 2023 successor post (sources/2026-04-21-planetscale-deploying-multiple-schema-changes-at-once) canonicalises the single-shard, multiple-change case exhaustively; the 2022 launch post is where the multi-shard dimension first appears on the wiki.
Still "almost simultaneously (though not atomically)"¶
The post explicitly refuses to claim cross-shard cutover atomicity:
"The switch then takes place almost simultaneously (though not atomically) on all shards."
The within-shard cutover is already "near-atomic, not atomic" because MySQL lacks transactional multi-table DDL (concepts/near-atomic-schema-deployment). Adding shards multiplies the non-atomicity: even if every shard does its cutover in one second, eight shards still take eight seconds from first-cut to last-cut in the worst case. Clients doing scatter-gather reads during this window may observe schema skew.
This is a structural property of running MySQL per shard — the problem is not solvable at the deploy-controller layer, only minimised by the gating. The practical mitigation is that the window is seconds, so most reads do not straddle it.
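The eight-shards-at-one-second arithmetic above can be stated directly (the drift figure is illustrative, not from the source):

```python
shards = 8
per_shard_cutover_s = 1.0    # assumed per-shard cut-over duration
ungated_drift_s = 3 * 3600   # hypothetical: hours of per-shard drift without a gate

# Worst case: the gated cut-overs fully serialize, so the fleet-wide
# switch spans shards * per_shard_cutover_s — seconds, not hours.
gated_skew_s = shards * per_shard_cutover_s
print(f"ungated: {ungated_drift_s} s of skew, gated worst case: {gated_skew_s:.0f} s")
```

The gate does not eliminate the skew window; it shrinks it by roughly three orders of magnitude, which is why most scatter-gather reads never straddle it.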
Relationship to the monolith illusion¶
"A multi-sharded database acts as though it were a monolith" is a deliberate design invariant of Vitess-style sharding. Applications running against a sharded cluster expect:
- One schema to read against (enforced via schema routing rules at VTGate).
- One consistent query shape across shards.
- Schema migrations as a single deployment event, not a per-shard rollout the app must coordinate against.
Multi-shard schema sync is the operational primitive that delivers the monolith illusion during migrations. Without it, clients would need to reason about per-shard schema state — or the deploy controller would need to force shards onto the same schema before serving traffic (which would require taking the whole cluster offline during migration, defeating the purpose of online DDL).
Composes with other gated-deployment primitives¶
- Gate multiple changes × multiple shards. The complete deploy-controller state space is the readiness matrix (N changes × M shards = N×M flags). The gate opens when all N×M are green.
- Cancel-before-cutover applies across the whole matrix. Cancelling a multi-shard deployment tears down shadow tables on every shard — no partial-shard cut-over is possible while the gate is closed.
- 30-minute revert window is measured from the final cross-shard cutover, not the first. The inverse replication streams stage per-shard after each shard's cut-over; the 30-minute clock starts when the multi-shard switch completes.
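The N×M readiness matrix and cancel-before-cutover semantics above can be sketched as follows; the class and method names are hypothetical, as the readiness-aggregation protocol itself is undisclosed.

```python
class MatrixGate:
    """Hypothetical changes-by-shards readiness matrix: one flag per
    (change, shard) cell; the gate opens only when every cell is green."""

    def __init__(self, changes, shards):
        self.ready = {(c, s): False for c in changes for s in shards}

    def mark_ready(self, change, shard):
        self.ready[(change, shard)] = True

    def gate_open(self):
        return all(self.ready.values())

    def cancel(self):
        # Cancel-before-cutover: tear down every shadow table across the
        # whole matrix. While the gate is closed, no cell has cut over,
        # so no partial-shard state needs unwinding.
        self.ready = {cell: False for cell in self.ready}


gate = MatrixGate(["add-index", "add-column"], ["shard-1", "shard-2"])
for change in ["add-index", "add-column"]:
    gate.mark_ready(change, "shard-1")
print(gate.gate_open())  # False — shard-2's cells are still pending
```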
Seen in¶
- sources/2026-04-21-planetscale-gated-deployments-addressing-the-complexity-of-schema-deployments-at-scale — canonical wiki disclosure. Shlomi Noach's 2022 Gated Deployments launch post canonicalises the multi-shard dimension as one of two orthogonal axes under "multi-dimensional deployments." The post frames the problem (shards drift because workloads differ) and the solution (gate the cut-over across shards) but does not disclose the per-shard readiness-aggregation protocol.
- sources/2026-04-21-planetscale-deploying-multiple-schema-changes-at-once — sibling canonical source. The 2023 successor post canonicalises the cross-change dimension in depth but inherits the multi-shard dimension from the 2022 launch post. Both posts are by Shlomi Noach; both are architecturally compatible.
Related¶
- concepts/gated-schema-deployment
- concepts/near-atomic-schema-deployment
- concepts/staged-then-sealed-migration
- concepts/cutover-freeze-point
- concepts/cross-shard-query
- concepts/schema-routing-rules
- concepts/online-ddl
- systems/vitess
- systems/vitess-vreplication
- systems/planetscale
- patterns/near-atomic-multi-change-deployment
- patterns/operator-scheduled-cutover
- patterns/shadow-table-online-schema-change
- companies/planetscale