
CONCEPT

Storage replication for durability

Definition

Storage replication for durability is the architectural pattern where data is written to N independent storage targets — typically a primary + (N−1) replicas — so that data loss requires all N to fail simultaneously. Under independent-failure assumptions, this drives the data-loss probability from P to P^N, enabling durable storage on non-durable media (e.g., local NVMe on ephemeral cloud VMs).

Dicken's canonical example

"As a made up value, say in a given month, there is a 1% chance of a server failing. With a single server, this means we have a 1% chance of losing our data each month. This is an unacceptable for any serious business purpose. However, with three servers, this goes down to 1% × 1% × 1% = 0.0001% chance (1 in one million)."

(Source: sources/2025-03-13-planetscale-io-devices-and-latency)

With automated node replacement and frequent backups added on top, Dicken argues that even this conservative 1-in-a-million figure understates real-world durability.
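Dicken's arithmetic can be checked directly. A minimal sketch, assuming independent monthly failure probabilities (the 1% figure is Dicken's made-up value):

```python
def monthly_loss_probability(p_server_fail: float, n_replicas: int) -> float:
    """Probability of losing data in a month: all N copies must fail
    in the same window, assuming failures are independent."""
    return p_server_fail ** n_replicas

single = monthly_loss_probability(0.01, 1)  # one server: 1% per month
triple = monthly_loss_probability(0.01, 3)  # three servers: 0.0001%, 1 in a million

print(f"1 server:  {single:.6%}")
print(f"3 servers: {triple:.6%}")
```

The independence assumption doing the heavy lifting here is exactly what the caveat section below examines.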

Why it matters in the cloud

The cloud-era default was to pair compute with network-attached block storage (EBS, Persistent Disk) whose storage system internally replicates across hosts. This decoupled durability from the instance (the instance dies, the data survives) at the cost of a ~5× latency penalty (concepts/network-attached-storage-latency-penalty) and often an IOPS cap (concepts/iops-throttle-network-storage).

Storage-replication-for-durability at the application / database layer turns the trade around: use local NVMe (ephemeral, fast) as the storage medium, and solve the "server dies, data dies" problem at the replication layer instead. A primary + 2 replicas on local NVMe gives you:

  • Local-NVMe latency (~50 μs) for the foreground path.
  • Cross-instance durability via the replication.
  • Automated failover if the primary's drive is lost.
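The write path this arrangement implies can be sketched with in-memory stand-ins for the NVMe devices. `ReplicatedLog` and `promote` are illustrative names, not PlanetScale's API:

```python
class ReplicatedLog:
    """Toy primary + 2 replicas: every write lands on all three stores
    before it is acknowledged, so losing any single drive loses no
    acknowledged data."""

    def __init__(self, n_replicas: int = 2):
        self.primary: list[bytes] = []
        self.replicas: list[list[bytes]] = [[] for _ in range(n_replicas)]

    def write(self, record: bytes) -> None:
        self.primary.append(record)
        for replica in self.replicas:  # replicate before acking the client
            replica.append(record)

    def promote(self) -> None:
        """Failover: a replica becomes the new primary when the
        primary's drive is lost."""
        self.primary = self.replicas.pop(0)

log = ReplicatedLog()
log.write(b"row-1")
log.promote()  # simulate losing the primary's drive
assert log.primary == [b"row-1"]  # acknowledged data survives failover
```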

See patterns/direct-attached-nvme-with-replication + systems/planetscale-metal as the canonical instance.

Independent-failure caveat

The P^N math only holds if failures are independent. In practice:

  • Same rack / same power rail → correlated failure, N drops effectively to 1 for that failure mode.
  • Same AZ / region → correlated failure for regional power, cooling, or network events.
  • Same software version → correlated failure on deterministic bugs (all replicas crash on the same input).
  • Same backup / restore window → correlated corruption if the bug writes bad data across replicas before anyone notices.
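The effect of correlation on the arithmetic, as an illustrative sketch (the 0.1% AZ-event probability is a made-up figure, in the same spirit as Dicken's 1%):

```python
p_drive = 0.01   # independent per-server failure probability (made up)
p_az    = 0.001  # probability of an AZ-wide outage event (made up)

# The P^N story, valid only under independence:
independent_loss = p_drive ** 3  # 1e-6

# If all three replicas share one AZ, an AZ event takes out all of
# them at once, so it adds directly to the loss probability and
# dominates the independent term:
correlated_loss = p_drive ** 3 + p_az  # ~0.1%, not 0.0001%

print(f"independent placement: {independent_loss:.6%}")
print(f"same-AZ placement:     {correlated_loss:.6%}")
```

The correlated mode is roughly a thousand times more likely than the triple independent failure, which is why placement, not replica count, dominates real durability.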

Real deployments pay down correlated-failure risk with:

  • Across-AZ or across-region replicas — geographic independence.
  • Heterogeneous hardware generations — batch/firmware independence.
  • Staggered rollouts — software-version independence.
  • Frequent backups to a cold tier — time-independent recovery point.
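The first two mitigations amount to a placement constraint. A sketch, assuming replica metadata with hypothetical `az` and `hw_gen` fields:

```python
def placement_is_independent(replicas: list[dict]) -> bool:
    """Reject placements where any two replicas share an AZ or a
    hardware generation, since both are correlated-failure domains."""
    azs  = [r["az"] for r in replicas]
    gens = [r["hw_gen"] for r in replicas]
    return len(set(azs)) == len(azs) and len(set(gens)) == len(gens)

good = [{"az": "us-east-1a", "hw_gen": "g3"},
        {"az": "us-east-1b", "hw_gen": "g4"},
        {"az": "us-east-1c", "hw_gen": "g5"}]
bad  = [{"az": "us-east-1a", "hw_gen": "g3"},
        {"az": "us-east-1a", "hw_gen": "g4"},  # shares an AZ with replica 0
        {"az": "us-east-1c", "hw_gen": "g5"}]

assert placement_is_independent(good)
assert not placement_is_independent(bad)
```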

Dicken: "At PlanetScale the protection is actually far stronger than even this, as we automatically detect and replace failed nodes in your cluster. We take frequent and reliable backups of the data in your database for added protection."

Replication models

  • Synchronous to N replicas — strong consistency; primary commit waits for all N acks.
  • Semi-synchronous (N of M replicas) — tunable consistency; commit waits for a quorum of acks.
  • Asynchronous — eventual consistency; primary acks immediately, so data may be lost if the primary dies before replication.

OLTP deployments typically pick semi-sync to 1-2 replicas — durability without waiting for the slowest replica. See concepts/leader-follower-replication + concepts/asynchronous-replication.
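The semi-sync commit rule (acknowledge after a quorum of acks, without waiting for the slowest replica) can be sketched with `concurrent.futures`; `send_to_replica` is a stand-in for the real network call:

```python
import concurrent.futures
import random
import time

def send_to_replica(replica_id: int, record: bytes) -> int:
    """Stand-in for a replica write over the network; latency varies."""
    time.sleep(random.uniform(0.001, 0.01))
    return replica_id

def semi_sync_commit(record: bytes, n_replicas: int = 3, quorum: int = 2) -> list[int]:
    """Acknowledge the commit once `quorum` of `n_replicas` have acked."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=n_replicas)
    futures = [pool.submit(send_to_replica, i, record) for i in range(n_replicas)]
    acks = []
    for fut in concurrent.futures.as_completed(futures):
        acks.append(fut.result())
        if len(acks) >= quorum:
            break  # durability reached; don't wait for the slowest replica
    pool.shutdown(wait=False)  # remaining sends drain in the background
    return acks

acks = semi_sync_commit(b"row-1")
print(f"committed after {len(acks)} of 3 acks")
```

Fully synchronous replication is this sketch with `quorum == n_replicas`; asynchronous is `quorum == 0`, which is where the data-loss window comes from.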
