PATTERN Cited by 1 source
Block-level continuous replication¶
Shape¶
Replicate changes at the block-device layer, continuously (not on a snapshot schedule), producing a crash-consistent replica that tracks the source within seconds of each write. Recovery = launch compute against the replicated block state — which boots as if the primary had crashed and come back up.
Three properties jointly:
- Block-level — captures all filesystem / database state below the application layer; application-agnostic, workload-agnostic.
- Continuous — replication runs constantly, not on a snapshot cadence; RPO is seconds, not minutes/hours.
- Crash-consistent — see concepts/crash-consistent-replication — no application quiesce required; the replica is what a crash+reboot would produce.
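A minimal sketch of what the three properties mean mechanically (all class and variable names are hypothetical, not any vendor's API): writes are intercepted at the block layer and applied to the replica in write order, so stopping the stream at any point yields exactly the state a crash+reboot of the primary would have produced.

```python
# Toy model of block-level continuous replication.
# Hypothetical names; a real agent intercepts writes at the
# device or hypervisor layer, below the filesystem.

class BlockDevice:
    def __init__(self, num_blocks):
        self.blocks = [b"\x00"] * num_blocks

    def write(self, lba, data):
        self.blocks[lba] = data


class ContinuousReplicator:
    """Streams every source write to the target, in order, as it happens."""
    def __init__(self, source, target):
        self.source, self.target = source, target

    def write(self, lba, data):
        self.source.write(lba, data)   # the application-visible write
        self.target.write(lba, data)   # replicated continuously (here: immediately)


source, replica = BlockDevice(8), BlockDevice(8)
repl = ContinuousReplicator(source, replica)

# Application-agnostic: the replicator sees only block writes,
# whether they came from a filesystem journal, a database, or anything else.
repl.write(0, b"journal")
repl.write(3, b"data")

# Crash-consistent: at any cut point the replica holds a write-ordered
# prefix of the source's history, i.e. what a crash+reboot would leave.
assert replica.blocks == source.blocks
```

The design choice this illustrates: because ordering is preserved, no application quiesce is needed, but nothing above the block layer (e.g. transactional boundaries) is understood.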
Why continuous vs scheduled¶
Scheduled snapshots (AMIs, AWS Backup plans, per-service backup cadences) are crash-consistent too but with RPO = snapshot interval. Continuous replication pushes RPO to seconds at the cost of:
- A replication agent (or hypervisor-level intercept) on the source,
- A network-bandwidth commitment to the replication target,
- A staging area in the target (replicated-but-not-yet-launched state).
The break-even point vs scheduled snapshots is when RPO requirements move from "minutes acceptable" to "seconds required."
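The break-even arithmetic can be made concrete. A hedged sketch with illustrative numbers (the 2 MB/s write rate and 5 s lag are assumptions, not vendor figures): worst-case data loss is RPO times sustained write rate, where RPO is the snapshot interval for scheduled snapshots and the replication lag for continuous replication.

```python
def data_at_risk_mb(rpo_seconds: float, write_rate_mb_per_s: float) -> float:
    """Upper bound on data lost at failover: RPO x sustained write rate."""
    return rpo_seconds * write_rate_mb_per_s

# Hourly snapshots (worst-case RPO = the full interval) vs continuous
# replication (RPO = replication lag, here an assumed ~5 s), at an
# illustrative 2 MB/s of writes:
hourly = data_at_risk_mb(3600, 2)     # up to 7200 MB of writes at risk
continuous = data_at_risk_mb(5, 2)    # up to 10 MB of writes at risk

assert hourly / continuous == 720
```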
Why block-level vs app-level¶
App-level (logical) replication (database log shipping, streaming CDC, Redis replication) offers stronger consistency guarantees per workload but requires per-workload implementation. Block-level offers a single mechanism that works for every filesystem and every storage-consuming workload — the substrate of modern server-DR primitives.
Canonical AWS primitive¶
AWS DRS is the canonical native implementation: agents on source machines stream block changes to a staging subnet in the target Region; automated server conversion launches EC2 instances from the replicated state; RPO seconds, RTO 5–20 min. (Source: sources/2026-03-31-aws-streamlining-access-to-dr-capabilities)
Tier fit on the DR ladder¶
Maps naturally onto the middle tiers of the DR ladder:
- Pilot light — the data tier is block-level-continuously-replicated into staging; compute is instantiated on failover.
- Warm standby — the replica is also booted at reduced scale, so failover is scale-up rather than cold-start.
Above warm-standby is multi-site active-active, which requires a different replication model (bidirectional, conflict-resolving, app-aware) — block-level continuous doesn't extend there without significant additional machinery.
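One way to see why warm standby sits a rung above pilot light: failover time is dominated by whatever work remains after the disaster. A toy model with purely hypothetical step timings (illustrative, not measured):

```python
# Illustrative failover-step durations in minutes (hypothetical values).
# Pilot light: data is already replicated; compute must be built from scratch.
PILOT_LIGHT = {
    "promote_replicated_data": 1,
    "provision_compute": 10,
    "boot_and_health_check": 5,
}

# Warm standby: a reduced-scale fleet is already running; failover is scale-up.
WARM_STANDBY = {
    "promote_replicated_data": 1,
    "scale_up_running_fleet": 4,
    "shift_traffic": 1,
}

def rto_minutes(steps: dict) -> int:
    return sum(steps.values())

assert rto_minutes(WARM_STANDBY) < rto_minutes(PILOT_LIGHT)
```

The tradeoff the model makes visible: warm standby buys the shorter RTO by paying steady-state compute cost for the reduced-scale fleet; in both tiers the data-tier step is cheap because continuous block replication has already done that work.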
Tradeoffs¶
- + Application-agnostic — one DR substrate for any EC2-style workload.
- + Seconds-scale RPO without application cooperation.
- + Minutes-scale RTO (5–20 min for DRS).
- − Block-level replication doesn't understand transactional boundaries — you get crash-consistent, not app-consistent.
- − Does not cover: Lambda functions, Auto Scaling logic, ECS task definitions / EKS pod specs, Route 53 / VPC / IAM config — needs full-workload orchestration layered on top (partner products like Arpio or custom tooling).
- − Steady-state cost of continuous replication: agents + network + staging storage — not free like "we can restore from last night's backup."
Seen in¶
- sources/2026-03-31-aws-streamlining-access-to-dr-capabilities — canonical wiki reference. Quantifies the seconds-RPO / 5–20-min-RTO profile; positions block-level continuous as the enabling primitive for pilot-light + warm-standby tiers above backup-and-restore (AWS Backup).
Related¶
- concepts/crash-consistent-replication — the consistency model.
- systems/aws-elastic-disaster-recovery — canonical AWS primitive.
- concepts/disaster-recovery-tiers — which tiers this pattern enables.
- patterns/pilot-light-deployment, patterns/warm-standby-deployment — the tiers where continuous block replication is the enabling data-tier substrate.