CONCEPT

Minority-quorum writeability¶

Definition¶

Minority-quorum writeability is the property of a replication system in which a strict minority of the nodes can satisfy the write-acknowledgement rule, so writes can proceed even when a majority of the cluster is unreachable. It is the feature that distinguishes MySQL semi-sync from a consensus protocol like Paxos or Raft, and it is the structural reason semi-sync admits split-brain even when the majority of the cluster remains online.

Noach's canonical observation¶

Shlomi Noach names the property directly in the semi-sync 1-n deployment:

"Of special interest is that in our '1-n' scenario, we have a quorum of two servers out of five or more. The primary, with a single additional replica, are able to form a quorum and to accept writes. That's how we got to have a split brain. While R2, R3, R4 form a majority of the servers, writes took place without their agreement." (Source: )

Take a five-node cluster with rpl_semi_sync_master_wait_for_slave_count=1. The primary plus one acknowledging replica is 2 nodes, a strict minority, yet writes commit. If a partition leaves only those two nodes on the primary's side and the other three form a majority, the primary's side keeps committing, and the majority side can start committing too as soon as it promotes a new primary: the classic split-brain topology.
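The arithmetic can be checked with a minimal sketch. It assumes, as Noach's "two servers out of five" count does, that the primary itself counts as a quorum member; the function names are hypothetical and not part of MySQL.

```python
def quorum_size(wait_for_slave_count: int) -> int:
    """Nodes that hold a write at commit time: the primary plus the acking replicas."""
    return 1 + wait_for_slave_count


def is_minority_quorum(cluster_size: int, wait_for_slave_count: int) -> bool:
    """True when the write quorum is smaller than a strict majority of the cluster."""
    majority = cluster_size // 2 + 1
    return quorum_size(wait_for_slave_count) < majority


# Five nodes, one required ack: a quorum of 2 against a majority of 3.
print(is_minority_quorum(5, 1))  # True
```

With the same assumption, a three-node cluster with one required ack is not minority-writeable, since two nodes are already a majority of three.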

Why semi-sync is structured this way¶

The design decision is a performance trade-off. MySQL semi-sync is layered on top of asynchronous replication as an optimisation, not rebuilt as a consensus protocol. The ack rule is the minimum number of replicas the primary must hear from — tuned low (typically 1) to keep write latency small and tolerate replica-side lag without blocking. It was never designed to prevent concurrent leadership; that was assumed to be a failover-layer concern.

A majority-quorum rule would need ⌊N/2⌋+1 nodes to hold every write. Counting the primary as a quorum member, as Noach's two-of-five count does, that means rpl_semi_sync_master_wait_for_slave_count ≥ ⌊N/2⌋: 2 replica acks per write for a 5-node cluster, 3 for a 7-node cluster. Every write would then wait on the slowest replica among those required acks. The cost on the hot path is substantial, and it still doesn't prevent split-brain by itself: you'd also need per-write proposal numbers, leader leases, and quorum reads.
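The latency side of the trade-off can be sketched as follows: if the primary commits once k replicas have acknowledged, the commit waits for the k-th fastest ack. The function name and the ack latencies below are hypothetical illustration values, not measurements.

```python
def commit_latency_ms(replica_ack_ms: list[float], wait_for_slave_count: int) -> float:
    """Time until the k-th fastest replica ack arrives, where k is the wait count."""
    if not 1 <= wait_for_slave_count <= len(replica_ack_ms):
        raise ValueError("wait count must be between 1 and the number of replicas")
    return sorted(replica_ack_ms)[wait_for_slave_count - 1]


# Hypothetical layout: two replicas in the primary's data centre, two remote.
acks = [0.8, 1.1, 35.0, 42.0]
for k in (1, 2, 3):
    print(k, commit_latency_ms(acks, k))
```

In this made-up layout, raising the wait count from 2 to 3 pulls a cross-DC round trip onto every write, which is the kind of cost a low wait count is chosen to avoid.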

The structural split-brain story¶

With minority-quorum writeability, a partition that leaves the primary with at least wait_for_slave_count replicas on its side keeps that side writable. If a failover mechanism on the other side (a majority, but with no visibility into acks on the primary's side) promotes a new primary, the two sides make independent progress:

Partition separates {primary, R1} from {R2, R3, R4}.
Old primary + R1 keep committing writes (satisfies wait_for_slave_count=1).
Majority side detects primary-unreachable; promotes R2.
New primary + R3/R4 also commit writes.
Two independent write streams → divergent data.
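The sequence above can be reproduced with a toy model. This is a sketch under simplifying assumptions: the `Side` class is hypothetical, every reachable replica acks instantly, and the promotion on the majority side is taken as already done.

```python
class Side:
    """One side of a partition: a primary, its reachable replicas, and its commit log."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        self.replicas = replicas
        self.log: list[str] = []

    def write(self, value: str, wait_for_slave_count: int = 1) -> bool:
        # Semi-sync style rule: commit once enough reachable replicas acknowledge.
        if len(self.replicas) >= wait_for_slave_count:
            self.log.append(value)
            return True
        return False


# The partition separates {primary, R1} from {R2, R3, R4}; R2 gets promoted.
old_side = Side("primary", ["R1"])
new_side = Side("R2", ["R3", "R4"])

old_committed = old_side.write("x=1")  # old primary still satisfies the ack rule
new_committed = new_side.write("x=2")  # so does the newly promoted primary
diverged = old_side.log != new_side.log
print(old_committed, new_committed, diverged)
```

Both writes commit and the two logs differ, because neither ack rule ever required contact with the other side.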

Noach's point is that the majority of the cluster being up does not prevent this, because the write rule never consulted the majority.

Contrast with majority-quorum protocols¶

Under Paxos/Raft:

Writes require a majority ack — impossible from a minority partition.
Leader election requires a majority vote — a minority partition can't elect.
Split-brain is precluded at the protocol level (the minority partition simply cannot make progress).

The cost is exactly the per-write majority-round-trip latency that semi-sync avoided.
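A minimal sketch of why the majority rule precludes two writable sides: any split of the cluster gives a strict majority to at most one side. The helper name is hypothetical, and the check models only the counting argument, not a full Paxos or Raft implementation.

```python
def has_majority(reachable: int, cluster_size: int) -> bool:
    """True when the reachable nodes (including the caller) form a strict majority."""
    return reachable > cluster_size // 2


# The partition from the split-brain story, under a majority rule.
minority_side = has_majority(2, 5)  # {primary, R1}: cannot write or elect
majority_side = has_majority(3, 5)  # {R2, R3, R4}: can do both
print(minority_side, majority_side)

# Exhaustive check for small clusters: no two-way split has a majority on both sides.
for n in range(1, 10):
    for left in range(n + 1):
        assert not (has_majority(left, n) and has_majority(n - left, n))
```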

The reconciliation alternative: pluggable durability¶

The modern architectural response is pluggable durability rules (FlexPaxos-inspired): instead of a single wait_for_slave_count integer, express durability as a predicate over the replica set, such as "at least one ack from each of ≥2 zones" or "at least N acks with at least M from region X". Such a rule can be configured to make minority-quorum writeability topologically impossible while still expressing common deployment shapes. MySQL semi-sync does not offer this expressiveness.
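One way to picture a durability predicate is as a composable function over the set of acks received so far. The sketch below is illustrative only: `Ack`, `min_acks`, `min_zones`, and `all_of` are hypothetical names, not the API of MySQL or of any existing durability plugin.

```python
from typing import Callable, NamedTuple


class Ack(NamedTuple):
    node: str
    zone: str


# A durability rule answers: is this set of acks enough to call the write durable?
Rule = Callable[[list[Ack]], bool]


def min_acks(n: int) -> Rule:
    """At least n distinct nodes have acknowledged."""
    return lambda acks: len({a.node for a in acks}) >= n


def min_zones(z: int) -> Rule:
    """Acks have arrived from at least z distinct zones."""
    return lambda acks: len({a.zone for a in acks}) >= z


def all_of(*rules: Rule) -> Rule:
    """Every sub-rule must hold."""
    return lambda acks: all(rule(acks) for rule in rules)


rule = all_of(min_acks(2), min_zones(2))
same_zone = [Ack("R1", "zone-a"), Ack("R2", "zone-a")]
cross_zone = [Ack("R1", "zone-a"), Ack("R3", "zone-b")]
print(rule(same_zone), rule(cross_zone))
```

A plain integer count accepts both ack sets; the predicate rejects the first because it never left zone-a, which is the kind of topology constraint a node count cannot express.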

Operational counter-measures¶

Operators running semi-sync at multi-DC scale mitigate minority-quorum writeability operationally:

Fence the old primary at the network / VIP layer before promoting a new one, so the minority-writeable side can't keep committing.
Accept the outage rather than promote when the primary partition might still be serving — wait for the partition to heal.
Force reparenting through an anti-flapping controller (Orchestrator / VTOrc) that imposes dwell-time between leadership changes so rapid partition cycles can't produce interleaved writes.

None of these are protocol-level guarantees; they are empirical engineering controls that work because operators have tuned them against observed partition shapes.
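The first and third controls can be combined into a simple promotion gate. This is a sketch of the idea only; `PromotionGuard` is a hypothetical class and does not reflect how Orchestrator or VTOrc are actually implemented.

```python
from typing import Optional


class PromotionGuard:
    """Allow a promotion only if the old primary is fenced and the dwell time has passed."""

    def __init__(self, dwell_seconds: float):
        self.dwell_seconds = dwell_seconds
        self.last_promotion_at: Optional[float] = None

    def may_promote(self, now: float, old_primary_fenced: bool) -> bool:
        # Never promote while the old primary could still be committing writes.
        if not old_primary_fenced:
            return False
        # Anti-flapping: refuse a second leadership change inside the dwell window.
        if self.last_promotion_at is not None:
            if now - self.last_promotion_at < self.dwell_seconds:
                return False
        return True

    def record_promotion(self, now: float) -> None:
        self.last_promotion_at = now


guard = PromotionGuard(dwell_seconds=600)
print(guard.may_promote(now=0, old_primary_fenced=False))  # False: fence first
print(guard.may_promote(now=0, old_primary_fenced=True))   # True
```

The timestamp is passed in explicitly so the gate stays deterministic and easy to test; a real controller would also have to decide how fencing is verified, which this sketch leaves out.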

Seen in¶

— canonical wiki introduction. Noach identifies the property as the structural root cause of semi-sync split-brain and uses it to close the post's argument that operators familiar with Paxos/Raft will find semi-sync topology counterintuitive.

concepts/split-brain — the failure mode this property produces.
concepts/mysql-semi-sync-replication — the mechanism that exhibits this property.
concepts/no-distributed-consensus — the family of protocols designed to avoid this property.
concepts/durability-vs-consistency-guarantee — the deeper axis this property sits on.
concepts/mysql-semi-sync-split-brain — Sugu's crash-restart variant of the same-family hazard.
patterns/pluggable-durability-rules — the architectural response that would let durability rules express zone/region coverage instead of node-count.
patterns/cross-dc-semi-sync-for-durability — the topology pattern that addresses a related axis.
systems/mysql