
CONCEPT Cited by 1 source

Durability vs. consistency guarantee

Definition

Durability and consistency are two orthogonal guarantees a replicated system can offer. A system can provide one without the other. Confusing them — or assuming that a mechanism that buys durability also buys consistency — is one of the most common errors in replication-topology design.

Shlomi Noach makes the distinction plainly in the context of MySQL semi-sync:

"The most straightforward use case for semi-sync is to guarantee durability of the data. We only approve a commit once the transaction (in the form of a changelog) is persisted elsewhere, on a different server, or on multiple servers. […] The term 'consistency' is overloaded, and especially in distributed systems." (Source: sources/2026-04-21-planetscale-mysql-semi-sync-replication-durability-consistency-and-split-brains)

The two axes

Durability is a property of data survival: if the primary tells you a commit succeeded, the data is persisted somewhere that survives the primary's failure. The question is "will this data survive the next crash?"

Consistency is a property of post-failover state: if you promote a new primary after the old one fails, its state is a correct, well-defined continuation of what the old primary had advertised. The question is "what will readers see, and does it contradict what was previously acknowledged?"

Noach's canonical counterexample: durable but not consistent

Take a MySQL semi-sync 1-n deployment (the primary waits for an acknowledgement from one of n replicas), with the primary and R1 in DC-A and R2, R3, and R4 in DC-B:

  1. UPDATE arrives on primary. R1 (same DC, fast) acks. Primary commits, tells user "OK".
  2. DC-A isolates. The UPDATE is on primary + R1 but nowhere else.
  3. Ops promotes R2 to cut the outage. R2 never saw the UPDATE.
  4. Readers against new-primary R2 see stale state. Previously-confirmed commit is invisible.

The data is durable: it exists on the primary and on R1, both still alive inside the isolated DC. A human could physically drive to DC-A and recover it. But the system is not consistent: the promise "a new primary will be a correct continuation of the old advertised state" is broken. The commit the user was told succeeded is not present on the new primary.
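The four steps above can be replayed as a small simulation. This is a minimal sketch, not MySQL's actual semi-sync protocol: the `Node` class, the `semi_sync_commit` helper, and the transaction label `UPDATE#1` are hypothetical names introduced for illustration, and "fastest replica acks first" is modelled simply by list order.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    dc: str
    log: list = field(default_factory=list)  # transactions persisted on this node


def semi_sync_commit(primary, replicas, txn, acks_required=1):
    """Persist txn on the primary, then collect acks_required replica acks.

    Replicas are tried in list order, standing in for "fastest first".
    Returns the names of the replicas that acknowledged.
    """
    primary.log.append(txn)
    acked = []
    for replica in replicas:
        if len(acked) == acks_required:
            break
        replica.log.append(txn)
        acked.append(replica.name)
    if len(acked) < acks_required:
        raise RuntimeError("not enough acks; commit is not confirmed")
    return acked


primary = Node("primary", "DC-A")
r1 = Node("R1", "DC-A")
r2, r3, r4 = (Node(n, "DC-B") for n in ("R2", "R3", "R4"))
nodes = [primary, r1, r2, r3, r4]

# Step 1: R1 shares the primary's DC, acks first, and the user is told "OK".
acked_by = semi_sync_commit(primary, [r1, r2, r3, r4], "UPDATE#1")
advertised = ["UPDATE#1"]

# Step 2: DC-A is isolated; only DC-B nodes remain reachable.
reachable = [n for n in nodes if n.dc == "DC-B"]

# Step 3: operations promotes R2, one of the reachable replicas.
new_primary = r2
assert new_primary in reachable

# Durable: every advertised transaction is held by some node that is still alive.
durable = all(any(t in n.log for n in nodes) for t in advertised)
# Consistent: the promoted node itself holds every advertised transaction.
consistent = all(t in new_primary.log for t in advertised)

print(acked_by, durable, consistent)  # ['R1'] True False
```

The two booleans are computed over different scopes, which is the whole point: durability asks whether the data exists anywhere, while consistency asks whether it exists on the node that readers will now talk to.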

Noach's definition of consistency

The post sets aside the CAP-theorem definition ("once data was written on one server, any immediate follow-up read on any other server must reflect that write") as too strict for the failover conversation. Instead:

"We will suffice with eventual consistency. This in itself an overloaded term, so let's clarify: in the case of a primary outage, we consider our system consistent if we're able to promote a new primary within any amount of time (obviously in practice we expect the time to be short) such that it is consistent with the previous primary's advertised state. I write 'advertised' state because the end user or applications should not be surprised by any data changes when the new primary is promoted, regardless of how transactions/replication work internally."

This is a failover-centric consistency definition: the new primary must not retract acknowledged writes. It is weaker than linearizability (which constrains real-time read order) but stronger than pure eventual consistency (which would allow promoting a stale replica and letting writes eventually merge).
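Read as a predicate, "must not retract acknowledged writes" says that the set of advertised transactions must be a subset of what the promotion candidate holds. The sketch below states that check over plain sets of transaction ids. The function names and ids are hypothetical, and it assumes the advertised set is known, which a real operator can usually only approximate.

```python
def retracted_writes(advertised, candidate_log):
    """Return the advertised transactions the candidate would retract."""
    return set(advertised) - set(candidate_log)


def pick_promotable(advertised, candidates):
    """Return the name of the first candidate that retracts nothing, or None.

    candidates maps a replica name to the set of transaction ids it holds.
    """
    for name, log in candidates.items():
        if not retracted_writes(advertised, log):
            return name
    return None


advertised = {"t1", "t2", "t3"}
candidates = {
    "R2": {"t1", "t2"},        # lags behind: t3 never arrived
    "R3": {"t1", "t2", "t3"},  # holds everything that was acknowledged
}

missing_on_r2 = retracted_writes(advertised, candidates["R2"])
choice = pick_promotable(advertised, candidates)
print(missing_on_r2, choice)  # {'t3'} R3
```

If `pick_promotable` returns `None`, no reachable replica satisfies the definition, and promoting anyway trades consistency for availability.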

The asymmetric cost of getting it wrong

  • Durability-without-consistency (what semi-sync produces under DC isolation): users have been told their data is safe, but readers of the new primary can't see it. Recovery requires human intervention to reconcile divergent writes. Split-brain is the extreme form.
  • Consistency-without-durability (e.g. async-only replication with synchronous quorum reads): readers always see a coherent view, but a crash at the wrong moment loses acknowledged writes. Data loss.

Many production MySQL deployments accept small durability-without-consistency windows, because the divergent writes can be reconciled manually, but refuse consistency-without-durability, because lost data cannot be recovered.

What buys each

Guarantee         | Typical mechanism
Durability only   | Semi-sync with k ≥ 1 replicas in a persistent store; binlog persistence; disk-level replication.
Consistency only  | Write to a single node; reads go to same node. (Works until that node fails.)
Both              | Consensus (Paxos/Raft), quorum-read-and-write; or lock-based leader election with leases and cross-DC quorum durability.

Getting both requires coordination on both the write path and the read/failover path. Semi-sync only touches the write path — hence the durability-only outcome.
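One way to see why the failover path matters is the generic quorum-intersection argument, which is not taken from Noach's post. If a write is acknowledged once W nodes hold it and a promotion consults R nodes, then every promotion is guaranteed to find the write only when W + R > N. The sketch below checks this exhaustively for small clusters. The function name and parameters are hypothetical, and it counts nodes only, ignoring DC placement.

```python
from itertools import combinations


def promotion_always_safe(n_nodes, write_holders, promotion_reads):
    """Check that every possible set of nodes holding an acknowledged write
    overlaps every possible set of nodes consulted at promotion time."""
    nodes = range(n_nodes)
    for writers in combinations(nodes, write_holders):
        for readers in combinations(nodes, promotion_reads):
            if not set(writers) & set(readers):
                return False  # a promotion could miss the acknowledged write
    return True


# Five nodes, as in the example: primary + R1..R4.
# Semi-sync with one ack: the write is on 2 nodes (primary + one replica);
# a promotion that can only consult the other 3 may miss it.
one_ack = promotion_always_safe(5, write_holders=2, promotion_reads=3)

# Requiring two replica acks puts the write on 3 nodes; any 3 consulted
# nodes must then include at least one holder.
two_acks = promotion_always_safe(5, write_holders=3, promotion_reads=3)

print(one_ack, two_acks)  # False True
```

The brute-force result matches the arithmetic: 2 + 3 = 5 is not greater than 5, while 3 + 3 = 6 is. Raising the ack count helps only if the promotion procedure actually consults enough nodes, which is the coordination on the failover path that semi-sync alone does not provide.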
