
CONCEPT Cited by 1 source

Availability vs data-loss trade-off

Definition

The availability-vs-data-loss trade-off is the operational decision a consensus-backed system must make when concurrent failures exceed the failure-tolerance envelope: either stall the system (preserving durability — no acknowledged write is lost — but costing availability for the duration) or abandon the unreachable nodes (preserving availability but accepting the potential loss of writes that were ack'd only by the unreachable side).

This trade-off is a choice the operator makes, not a protocol property. The consensus protocol's responsibility is to detect the out-of-envelope condition and surface the choice; the protocol cannot make it because the right answer depends on the business cost of lost data versus the business cost of unavailability — both external to the protocol.

Canonical framing

Sugu Sougoumarane's Part 3 statement:

"If this were to happen, the system has to allow for a compromise: abandon the two nodes and move forward. Otherwise, the loss of availability may become more expensive than the potential loss of that data." (Source: sources/2026-04-21-planetscale-consensus-algorithms-at-scale-part-3-use-cases)

The essay deliberately refuses to recommend one side over the other. The message is that both are legitimate, and the operator — not the protocol designer — decides.

The two options

Stall (preserve correctness)

  • Refuse to elect a new leader until the unreachable nodes rejoin.
  • Preserve all previously-ack'd writes: nothing is lost.
  • Costs availability: the system cannot serve new writes (and possibly reads) until the partition heals.
  • Appropriate for: financial ledgers, payment systems, audit trails, any workload where data loss is a compliance or contractual violation.

Abandon (preserve availability)

  • Elect a new leader from the reachable subset.
  • Accept the potential loss of any writes that were ack'd by nodes in the unreachable set but never propagated to the reachable set.
  • Preserves availability: the system continues serving new writes and reads.
  • Appropriate for: cache-coherence fabrics, telemetry pipelines, session stores, any workload where short periods of unavailability are operationally worse than a small loss of recent data.

Why the protocol can't decide

The business-cost asymmetry is workload-specific:

  • A retail payment system: an hour of stall is measurable in lost revenue (real dollars). A lost payment is measurable in reputational damage and audit-failure risk. The right answer is usually stall: an outage is recoverable; a lost-charge claim from a customer is not.
  • A collaborative editing tool's presence service: an hour of stall is a disaster (users think the tool is broken). A lost "user-is-typing" event is imperceptible. The right answer is usually abandon: availability dominates.

No protocol can know which workload it is running. The protocol's responsibility is to expose the choice (via operator-facing controls: a flag, a CLI command, a confirmation dialog, or a runbook-triggered automation path) rather than silently choosing one side and surprising the operator.
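A minimal sketch of "expose the choice rather than silently choosing", written in Python. The names `Resolution`, `OperatorDecisionRequired`, and `resolve_out_of_envelope` are hypothetical and do not come from the source or from any real system; the point is only that the out-of-envelope path has no default.

```python
from enum import Enum
from typing import Optional


class Resolution(Enum):
    STALL = "stall"      # wait for unreachable nodes; no ack'd write is lost
    ABANDON = "abandon"  # promote from reachable nodes; recent writes may be lost


class OperatorDecisionRequired(Exception):
    """Raised when failures are out of envelope and no choice was supplied."""


def resolve_out_of_envelope(operator_choice: Optional[Resolution]) -> str:
    """Return the action to take; never pick a side on the operator's behalf."""
    if operator_choice is None:
        # Surface the choice instead of defaulting to either side.
        raise OperatorDecisionRequired(
            "out-of-envelope failure: choose STALL or ABANDON"
        )
    if operator_choice is Resolution.STALL:
        return "refuse election until unreachable nodes rejoin"
    return "elect new leader from reachable subset"
```

In this sketch a flag, CLI command, or runbook automation would be whatever supplies `operator_choice`; the function itself only refuses to proceed without one.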

When does this apply?

Only when failures exceed the failure-tolerance envelope. Within the envelope:

  • A single node failure on a 5-node, 2-durability system is survivable (4 nodes remain, election reaches enough of them to intersect any possible write set). No trade-off is needed; the protocol handles it.
  • A zone failure on a 3-zone deployment with a per-zone durability predicate is survivable (the other 2 zones intersect the durability predicate). No trade-off is needed.

The trade-off is forced only when operational assumptions were wrong — two nodes failed when the operator expected one; both regions went down when the operator expected at most one. At that point, the protocol is structurally unable to continue safely without operator input.
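The in-envelope versus out-of-envelope distinction for the 5-node example can be sketched as simple arithmetic. This assumes that "2-durability" means a write is acknowledged once any 2 nodes hold it, so an election must reach enough nodes to overlap every possible 2-node set; that reading, and the function names, are assumptions made for illustration.

```python
def election_reach_needed(total_nodes: int, durability: int) -> int:
    """Smallest number of nodes an election must reach so that it
    intersects every possible set of `durability` nodes that could
    hold an acknowledged write (assumed interpretation)."""
    return total_nodes - durability + 1


def within_envelope(total_nodes: int, durability: int, failed: int) -> bool:
    """True if the reachable nodes still intersect any possible write set."""
    reachable = total_nodes - failed
    return reachable >= election_reach_needed(total_nodes, durability)


print(within_envelope(5, 2, failed=1))  # True: 4 reachable, 4 needed
print(within_envelope(5, 2, failed=2))  # False: 3 reachable, 4 needed
```

Under this assumption, one failure stays inside the envelope and a second failure is exactly the point at which the protocol can no longer proceed safely on its own.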

The relationship to CAP

The trade-off is the operational face of CAP at the moment of out-of-envelope partition:

  • C over A (stall / preserve correctness) = classical CP system choice.
  • A over C (abandon / preserve availability) = classical AP system choice.

The CAP theorem says a system cannot guarantee all three of Consistency, Availability, and Partition-tolerance simultaneously. The availability-vs-data-loss trade-off is where an operator locally selects which corner to give up for this specific partition event, which may differ from the long-run system classification. A system can be "CP in steady state; AP in emergency mode": CP by default, stalling when a partition happens, and escalating to AP if the partition lasts longer than a tolerance threshold.

Related mechanisms that soften, realise, or sidestep the trade-off:

  • Fencing + quorum reads (see concepts/durability-vs-consistency-guarantee): a CP-leaning system can fence the losing side before failover, then use quorum reads to ensure the new leader sees every write the old leader ack'd. Costs latency; preserves both durability and availability within the partition-tolerance budget.
  • Forced failover (manual operator action): most production systems offer an abandon-unreachable-nodes-and-promote-this-one escape hatch for the case where stall has gone on too long. This is the trade-off realised as a runbook action.
  • Multi-writer with conflict resolution (concepts/no-distributed-consensus): skip the trade-off entirely by letting both sides make progress independently and reconciling on merge. Not always an option — depends on whether the workload's invariants tolerate concurrent writes.
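The "CP in steady state; AP in emergency mode" pattern and the forced-failover escape hatch can be sketched as a small decision function. `EscalationPolicy`, `stall_tolerance_seconds`, and `decide` are hypothetical names; the threshold is assumed to be set by the operator ahead of time, so the escalation is still the operator's decision, made in advance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationPolicy:
    # Hypothetical operator-set threshold: how long a stall is tolerated
    # before the system is allowed to abandon the unreachable nodes.
    stall_tolerance_seconds: float


def decide(policy: EscalationPolicy,
           partition_seconds: float,
           forced_failover: bool = False) -> str:
    """Return "stall" or "abandon" for an out-of-envelope partition."""
    if forced_failover:
        # Manual escape hatch: the operator explicitly accepts possible data loss.
        return "abandon"
    if partition_seconds > policy.stall_tolerance_seconds:
        # The stall has outlasted the tolerance the operator configured.
        return "abandon"
    return "stall"


policy = EscalationPolicy(stall_tolerance_seconds=300)
print(decide(policy, partition_seconds=60))                         # stall
print(decide(policy, partition_seconds=301))                        # abandon
print(decide(policy, partition_seconds=10, forced_failover=True))   # abandon
```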

The symmetry with split-brain

concepts/split-brain is what happens when the protocol implicitly abandons unreachable nodes without enforcing mutual exclusion. The availability-vs-data-loss trade-off is what happens when the operator explicitly makes the abandonment decision with protocol awareness.

Split-brain ≈ uncontrolled abandonment: both sides accept writes, state diverges, reconciliation is manual and hard. Controlled abandonment via the trade-off ≈ one side acknowledges it has lost the minority's writes, the minority knows it lost, reconciliation is bounded.
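One way to picture why controlled abandonment keeps reconciliation bounded: when an abandoned node rejoins, the writes at risk are exactly those it holds that the new leader's log does not. The sketch below assumes each write has a unique ID and that logs can be compared directly; both are simplifications, and `abandoned_writes` is a hypothetical helper, not something described in the source.

```python
from typing import List


def abandoned_writes(new_leader_log: List[str],
                     rejoining_log: List[str]) -> List[str]:
    """Writes held by the rejoining node that the new leader never saw.

    These are the writes that were knowingly given up when the operator
    chose to abandon the unreachable nodes.
    """
    kept = set(new_leader_log)
    return [w for w in rejoining_log if w not in kept]


leader = ["w1", "w2", "w3", "w6"]
rejoining = ["w1", "w2", "w3", "w4", "w5"]
print(abandoned_writes(leader, rejoining))  # ['w4', 'w5']
```

In a split-brain scenario both sides would have kept accepting writes, so neither log could be treated as authoritative and a one-directional diff like this would not be enough.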

Seen in
