
CONCEPT Cited by 2 sources

RPO / RTO (recovery point / time objectives)

Definition

The two canonical Disaster Recovery budget dimensions:

  • RPO (Recovery Point Objective): the maximum acceptable data loss, measured as the time between the last recoverable point and the disaster. RPO answers "how much work am I willing to lose?"
  • RTO (Recovery Time Objective): the maximum acceptable duration of downtime before the workload is operational again in the recovery environment. RTO answers "how long am I willing to be down?"

Both are business-driven budgets — not engineering specifications. They are chosen first, then the DR tier is chosen to meet them.
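
To make the two definitions concrete, here is a minimal sketch that measures the data-loss window and the downtime of a single incident and compares them with the budgets. The names DrBudget, measure_incident, and within_budget, and all timestamps, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DrBudget:
    """Business-chosen budgets (hypothetical type, for illustration only)."""
    rpo: timedelta  # maximum tolerable data loss, expressed as time
    rto: timedelta  # maximum tolerable downtime


def measure_incident(last_recovery_point: datetime,
                     disaster_at: datetime,
                     service_restored_at: datetime) -> tuple[timedelta, timedelta]:
    """Return (data_loss_window, downtime) for one incident."""
    # Everything written after the last recoverable point is lost.
    data_loss_window = disaster_at - last_recovery_point
    # Downtime runs from the disaster until the workload serves again.
    downtime = service_restored_at - disaster_at
    return data_loss_window, downtime


def within_budget(budget: DrBudget,
                  data_loss_window: timedelta,
                  downtime: timedelta) -> bool:
    """True only if BOTH objectives were met."""
    return data_loss_window <= budget.rpo and downtime <= budget.rto


# Assumed example values: 15 minutes of tolerable loss, 1 hour of downtime.
budget = DrBudget(rpo=timedelta(minutes=15), rto=timedelta(hours=1))
loss, down = measure_incident(
    last_recovery_point=datetime(2026, 1, 1, 11, 50),
    disaster_at=datetime(2026, 1, 1, 12, 0),
    service_restored_at=datetime(2026, 1, 1, 12, 45),
)
print(loss, down, within_budget(budget, loss, down))
```

The two measurements are independent: an incident can meet the RPO and miss the RTO, or the reverse, which is why the check requires both.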

Order-of-magnitude mapping to DR tiers

The DR ladder is essentially an RPO/RTO-vs-cost trade curve:

| Tier | RPO | RTO | Cost |
|---|---|---|---|
| Backup-and-restore | Hours (snapshot interval) | Hours–days | Lowest |
| Pilot light | Seconds–minutes (with continuous replication) | Minutes–hours (compute cold-start) | Low |
| Warm standby | Seconds | Seconds–minutes | Higher |
| Multi-site active-active | ~0 (continuous dual-write) | ~0 (already serving) | Highest |

Canonical AWS-primitive RPO/RTO

| Primitive | RPO | RTO | Canonical wiki source |
|---|---|---|---|
| systems/aws-backup | Hours (schedule-based) | Hours (restore time) | sources/2026-03-31-aws-streamlining-access-to-dr-capabilities |
| EBS snapshots / AMIs | Snapshot interval (hours) | Minutes–hours | sources/2026-03-31-aws-streamlining-access-to-dr-capabilities |
| AWS DRS | Seconds (crash-consistent, continuous) | 5–20 minutes typical | sources/2026-03-31-aws-streamlining-access-to-dr-capabilities |
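
For the schedule-based primitives in the table, the achievable RPO is bounded by the gap between consecutive recovery points. The sketch below shows that arithmetic: it takes a list of recovery-point timestamps and returns the largest gap, which is the data-loss window if the disaster strikes just before the next recovery point. The function name worst_case_rpo and the timestamps are hypothetical, and the code does not call any AWS API.

```python
from datetime import datetime, timedelta


def worst_case_rpo(recovery_points: list[datetime]) -> timedelta:
    """Largest gap between consecutive recovery points.

    Assumption: each timestamp marks a point the workload can actually be
    restored to. Failed or incomplete backups must be excluded by the caller.
    """
    if len(recovery_points) < 2:
        raise ValueError("need at least two recovery points")
    ordered = sorted(recovery_points)
    return max(later - earlier for earlier, later in zip(ordered, ordered[1:]))


# Assumed example: a nominal 6-hour schedule where the 18:00 run was missed.
points = [
    datetime(2026, 1, 1, 0, 0),
    datetime(2026, 1, 1, 6, 0),
    datetime(2026, 1, 1, 12, 0),
    datetime(2026, 1, 1, 20, 0),
]
print(worst_case_rpo(points))
```

A single missed run widens the worst case from 6 to 8 hours here, so the effective RPO of a schedule depends on its longest observed gap and not on its nominal interval.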

Why both are needed

RPO and RTO can have opposite cost drivers:

  • Cheap RPO, expensive RTO: continuous replication to cold staging. Data loss is near zero, but recovery takes a long time.
  • Cheap RTO, expensive RPO: warm standby of stateless compute with infrequent snapshots. Recovery is fast, but more data is lost.
  • Both small: multi-site active-active. You pay for both continuous replication and a continuously live secondary, which makes it the most expensive tier.

DR sizing almost always starts with "our business can tolerate ≤ X minutes of data loss and ≤ Y minutes of downtime" and works backwards to pick the tier.
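
The "work backwards" step can be sketched as a lookup over the tiers, ordered cheapest first, that returns the first tier whose worst-case RPO and RTO both fit the budgets. The numeric bounds below are illustrative assumptions that turn the order-of-magnitude ranges on this page into single values. They are not measured or vendor-stated figures, and the names Tier, TIERS, and cheapest_tier are hypothetical.

```python
from datetime import timedelta
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    name: str
    worst_rpo: timedelta  # assumed upper bound on data loss
    worst_rto: timedelta  # assumed upper bound on downtime


# Ordered cheapest first. All bounds are ASSUMED placeholders; replace them
# with values measured for your own workload before relying on the result.
TIERS = [
    Tier("backup-and-restore", timedelta(hours=24), timedelta(days=3)),
    Tier("pilot light", timedelta(minutes=15), timedelta(hours=4)),
    Tier("warm standby", timedelta(minutes=1), timedelta(minutes=15)),
    Tier("multi-site active-active", timedelta(seconds=1), timedelta(seconds=1)),
]


def cheapest_tier(rpo: timedelta, rto: timedelta) -> Optional[Tier]:
    """Return the cheapest tier that meets both budgets, or None."""
    for tier in TIERS:
        if tier.worst_rpo <= rpo and tier.worst_rto <= rto:
            return tier
    return None


# 30 minutes of tolerable data loss, 8 hours of tolerable downtime.
print(cheapest_tier(timedelta(minutes=30), timedelta(hours=8)).name)
# Same RPO, but only 20 minutes of tolerable downtime.
print(cheapest_tier(timedelta(minutes=30), timedelta(minutes=20)).name)
```

Under these assumed bounds the first call selects pilot light and the second selects warm standby: tightening only the RTO is enough to move up a tier, because each budget constrains the choice independently.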

RPO/RTO in the cross-partition axis

sources/2026-01-30-aws-sovereign-failover-design-digital-sovereignty applies the same RPO/RTO framing to the cross-partition axis. The tiers are the same, but each one is more expensive because there is no cross-partition equivalent of S3 Cross-Region Replication, Transit Gateway, or Route 53 cross-Region health checks. The RPO/RTO numbers are also the basis for picking pilot light as the cross-partition default: its RPO/RTO is acceptable for the discrete, sovereignty-driven failover demand profile.

