
CONCEPT Cited by 1 source

Crash-consistent replication

Definition

Crash-consistent replication produces recovery points whose state is equivalent to what an unplanned crash + reboot of the source system would leave on disk. The replica captures the block-device state at a given instant without any application coordination.
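As a rough illustration of the definition, the sketch below models a block device as an ordered stream of writes and takes a recovery point by cutting that stream at an arbitrary instant. The names (`BlockWrite`, `image_at`) are hypothetical and not part of any real replication product. Real systems also have to handle volatile caches and in-flight I/O, which this toy model ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockWrite:
    seq: int      # position in the device's write order
    lba: int      # logical block address
    data: bytes   # new contents of that block


def image_at(writes: list[BlockWrite], cut_seq: int) -> dict[int, bytes]:
    """Return the block image as of `cut_seq`.

    Every write with seq <= cut_seq is applied in order; later writes are
    absent. No application is consulted about where the cut falls.
    """
    image: dict[int, bytes] = {}
    for w in sorted(writes, key=lambda w: w.seq):
        if w.seq > cut_seq:
            break
        image[w.lba] = w.data
    return image


writes = [
    BlockWrite(1, lba=0, data=b"journal: begin"),
    BlockWrite(2, lba=7, data=b"payload"),
    BlockWrite(3, lba=0, data=b"journal: commit"),
]

# A cut after write 2 lands mid-transaction, much as a power loss would.
recovery_point = image_at(writes, cut_seq=2)
```

The recovery point here contains the payload but not the commit record, which is the kind of state the application's own crash recovery is expected to handle.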

It is strictly weaker than application-consistent replication, which requires the application to quiesce writes, flush in-memory state, and emit an agreed-consistent snapshot. In exchange, it can run continuously without any application cooperation, and it is sufficient for most modern workloads, since they are already designed to survive unplanned crashes.

Why it is the right point on the consistency spectrum for DR

Modern storage layers (journalled filesystems, write-ahead-logged databases, transactional KV stores) are engineered around a crash-consistency assumption: on reboot, they replay the log, roll back uncommitted transactions, or repair the filesystem, and come up in a valid state. A crash-consistent replica is exactly the input those recovery procedures were designed for, so restoring from it looks, to the application, like a crash and reboot on the primary.
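To make the recovery argument concrete, here is a minimal sketch of write-ahead-log replay, assuming a simplified log format invented for this example (`("set", txn_id, key, value)` and `("commit", txn_id)` tuples). It is not how any particular database stores its log; it only shows that every prefix of the log, meaning every possible crash-consistent cut, recovers to a state where transactions are all-or-nothing.

```python
def recover(log: list[tuple]) -> dict[str, str]:
    """Replay a write-ahead log, keeping only committed transactions.

    Hypothetical record formats:
      ("set", txn_id, key, value)  -- a tentative write
      ("commit", txn_id)           -- makes that transaction durable
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    state: dict[str, str] = {}
    for rec in log:
        if rec[0] == "set" and rec[1] in committed:
            _, _, key, value = rec
            state[key] = value
    return state


full_log = [
    ("set", 1, "alice", "90"),
    ("set", 1, "bob", "110"),
    ("commit", 1),
    ("set", 2, "alice", "50"),   # txn 2 never reaches its commit record
    ("set", 2, "bob", "150"),
]

# Every prefix of the log is a possible crash-consistent image.
for cut in range(len(full_log) + 1):
    state = recover(full_log[:cut])
    # Txn 1 is either fully visible or absent; txn 2 is never visible.
    assert state in ({}, {"alice": "90", "bob": "110"})
```

Wherever the cut falls, the replica never exposes half of a transfer, which is why the application cannot distinguish a restore from an ordinary crash and reboot.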

Application-consistent replication is stronger, but it is at least one of:

  • intrusive (requires quiesce / flush hooks, can block writes),
  • cooperative (only works with apps that participate — VSS on Windows, app-specific commit-barrier APIs), or
  • coarse-grained (snapshot cadence, not continuous).
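The trade-off in the list above can be sketched with a toy application that buffers writes in memory. `ToyApp` and its methods are hypothetical, written only for this illustration: the application-consistent path has to take a lock and call a flush hook (intrusive and cooperative), while the crash-consistent path simply copies whatever is already on disk.

```python
import threading


class ToyApp:
    """Hypothetical app that buffers records in memory before flushing to 'disk'."""

    def __init__(self) -> None:
        self.disk: list[str] = []
        self.buffer: list[str] = []
        self._write_lock = threading.Lock()

    def write(self, record: str) -> None:
        # Blocks while an application-consistent snapshot holds the lock.
        with self._write_lock:
            self.buffer.append(record)

    def flush(self) -> None:
        self.disk.extend(self.buffer)
        self.buffer.clear()

    def crash_consistent_snapshot(self) -> list[str]:
        # Copies the disk as-is; buffered records are not captured.
        return list(self.disk)

    def app_consistent_snapshot(self) -> list[str]:
        # Quiesce (hold the write lock), flush, then copy.
        with self._write_lock:
            self.flush()
            return list(self.disk)


app = ToyApp()
app.write("order-1")
app.flush()
app.write("order-2")  # still only in memory

crash_image = app.crash_consistent_snapshot()  # ["order-1"]
app_image = app.app_consistent_snapshot()      # ["order-1", "order-2"]
```

The crash-consistent image misses the buffered record, exactly as a real crash would, but it needed no hook into the application and never blocked a writer.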

Crash-consistent replication is workload-agnostic: it works for any block-storage workload, and it is the substrate that makes continuous replication (seconds-scale RPO) viable.

Canonical AWS primitive

AWS Elastic Disaster Recovery (AWS DRS) provides "a nearly continuous block-level replication" capable of "a crash-consistent recovery point objective of seconds, and a recovery time objective typically ranging between 5–20 minutes." (Source: sources/2026-03-31-aws-streamlining-access-to-dr-capabilities)

The combination of crash consistency and near-continuous replication is what distinguishes DRS from snapshot-based DR:

  • Snapshots (AMIs, EBS snapshots, DB-level snapshots via AWS Backup) are crash-consistent too, but scheduled — RPO = snapshot interval (minutes to hours).
  • DRS is crash-consistent continuous — block changes stream as they happen, RPO = seconds.
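The RPO difference is simple arithmetic: the data lost in a failure is the time elapsed since the most recent recovery point. The sketch below compares an hourly snapshot schedule with a stream of recovery points every 5 seconds. The 5-second cadence and the function name `data_loss_window` are assumptions for illustration; the source only states an RPO "of seconds".

```python
from bisect import bisect_right


def data_loss_window(recovery_points: list[float], failure_time: float) -> float:
    """Seconds of writes lost: time since the latest recovery point at or
    before the failure. `recovery_points` must be sorted ascending."""
    i = bisect_right(recovery_points, failure_time)
    if i == 0:
        raise ValueError("no recovery point precedes the failure")
    return failure_time - recovery_points[i - 1]


# One day of recovery points, as seconds since midnight.
hourly_snapshots = [h * 3600.0 for h in range(24)]
continuous = [float(s) for s in range(0, 24 * 3600, 5)]  # assumed 5 s cadence

# Failure one second before the 14:00 snapshot would have been taken.
failure = 13 * 3600.0 + 3599.0

print(data_loss_window(hourly_snapshots, failure))  # 3599.0
print(data_loss_window(continuous, failure))        # 4.0
```

In the worst case the scheduled approach loses almost a full interval of writes, while the continuous stream loses only the few seconds since the last replicated change.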

Seen in
