CONCEPT Cited by 1 source
Schema disagreement¶
Definition¶
Schema disagreement is a state in which different nodes of a distributed database hold different versions of the schema, as advertised by the schema version each node gossips alongside its heartbeat state. The cluster should converge to a single schema version; when it doesn't, cross-node operations that depend on schema (prepared statement planning, repairs, replication) can fail or behave inconsistently.
Cassandra specifically gossips a schema version per node as part of its EndPointState (see systems/apache-cassandra for the EndPointState shape). The pre-upgrade gate Yelp uses is "Ensuring Cassandra schema versions are fully in agreement across the cluster".
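As a minimal sketch of what "in agreement" means operationally, the check reduces to grouping nodes by the schema version they report and seeing whether more than one group exists. The function names and sample data below are illustrative assumptions, not taken from the source. On a real Cassandra cluster the node-to-version mapping could be assembled from the `schema_version` column of `system.local` and `system.peers`, or from `nodetool describecluster` output; that collection step is left out here.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_schema_version(versions: Dict[str, str]) -> Dict[str, List[str]]:
    """Group node addresses by the schema version each one reports."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for node, version in versions.items():
        groups[version].append(node)
    # Sort node lists so the result is stable regardless of input order.
    return {version: sorted(nodes) for version, nodes in groups.items()}


def in_agreement(versions: Dict[str, str]) -> bool:
    """True when every node reports the same schema version."""
    return len(set(versions.values())) <= 1


# Hypothetical sample: node address -> schema version UUID (made-up values).
sample = {
    "10.0.0.1": "11111111-1111-1111-1111-111111111111",
    "10.0.0.2": "11111111-1111-1111-1111-111111111111",
    "10.0.0.3": "22222222-2222-2222-2222-222222222222",
}

if __name__ == "__main__":
    for version, nodes in group_by_schema_version(sample).items():
        print(version, nodes)
    print("in agreement:", in_agreement(sample))
```

Grouping, rather than returning only a boolean, shows which nodes hold the minority version, which is usually the first thing an operator wants to know.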
When it appears during upgrades¶
Two distinct moments in the upgrade lifecycle:
- Pre-flight gate — refuse to start an upgrade if the cluster already has schema disagreement. This is a cheap check that prevents one class of cluster fault from being baked into the upgrade state.
- Post-upgrade surprise — schema disagreement can appear after an upgrade that started from a clean state, especially on clusters with non-default workloads (e.g. CDC enabled). Root cause is often not well understood; remediation tends to be "nudge schema convergence from multiple nodes."
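The pre-flight gate described above can be sketched as a function that raises before any upgrade step runs. This is an illustrative outline only: `SchemaDisagreementError`, `preflight_schema_gate`, and the injected `fetch_versions` callable are hypothetical names, and the source does not describe how Yelp implements its gate.

```python
from typing import Callable, Dict


class SchemaDisagreementError(RuntimeError):
    """Raised when the cluster does not report a single schema version."""


def preflight_schema_gate(fetch_versions: Callable[[], Dict[str, str]]) -> str:
    """Return the agreed schema version, or raise if there is none.

    `fetch_versions` is a hypothetical callable returning a mapping of
    node address -> schema version, however the caller chooses to collect it.
    """
    versions = fetch_versions()
    if not versions:
        # No data is treated as a failure: an empty result proves nothing.
        raise SchemaDisagreementError("no schema versions reported; refusing to proceed")
    distinct = set(versions.values())
    if len(distinct) > 1:
        raise SchemaDisagreementError(
            f"{len(distinct)} schema versions present: {sorted(distinct)}"
        )
    return distinct.pop()


if __name__ == "__main__":
    # Hypothetical healthy cluster: both nodes report the same version.
    print(preflight_schema_gate(lambda: {"10.0.0.1": "v1", "10.0.0.2": "v1"}))
```

Failing closed on an empty result is a deliberate choice in this sketch: a gate that passes when it could not reach the cluster would defeat its purpose.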
Remediation: dummy multi-node schema changes¶
The remediation Yelp reports as empirically effective is making dummy schema changes from multiple nodes to force gradual convergence: "We found that making dummy schema changes from multiple nodes after the upgrade led to gradual schema convergence. This approach served as an effective remediation."
The mechanism: a schema change produces a new schema version on the node that issues it, and gossip propagates that version. If multiple nodes push changes in parallel, the cluster is forced through schema-negotiation rounds it might otherwise stay stuck on.
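One way to picture the remediation is a bounded loop that issues a dummy change from each of several nodes and re-checks for convergence between rounds. The sketch below is an assumption-laden outline, not Yelp's procedure: the source does not say which statement was used as the dummy change, how many nodes were involved, or how long to wait. Both `apply_dummy_change` and `fetch_versions` are hypothetical callables supplied by the caller.

```python
import time
from typing import Callable, Dict, Sequence


def nudge_until_converged(
    nodes: Sequence[str],
    apply_dummy_change: Callable[[str], None],
    fetch_versions: Callable[[], Dict[str, str]],
    max_rounds: int = 5,
    settle_seconds: float = 0.0,
) -> bool:
    """Issue dummy schema changes from several nodes until versions agree.

    apply_dummy_change(node) is assumed to run some harmless schema change
    through the given node (for example, altering a table comment; the
    exact statement is an assumption, not something the source specifies).
    Returns True once all nodes report one schema version, False if the
    round budget runs out first.
    """
    def converged() -> bool:
        return len(set(fetch_versions().values())) <= 1

    for _ in range(max_rounds):
        if converged():
            return True
        for node in nodes:
            apply_dummy_change(node)
            # Give gossip time to propagate before the next change.
            time.sleep(settle_seconds)
    return converged()
```

Bounding the loop with `max_rounds` keeps the remediation from generating schema changes indefinitely on a cluster that is not converging for some other reason.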
Seen in¶
- sources/2026-04-07-yelp-zero-downtime-cassandra-4x-upgrade — canonical wiki Seen-in. Yelp's experience after upgrading CDC-enabled Cassandra clusters from 3.11 to 4.1. The root cause is "not fully understood"; the remediation is empirical.
Related¶
- systems/apache-cassandra — the canonical gossip-schema-versioning datastore.
- concepts/schema-evolution — the broader concept of schema change handling.
- concepts/mixed-version-cluster — the upgrade state that can precipitate schema disagreement.
- concepts/gossip-protocol — the transport that carries schema-version heartbeats in Cassandra.