CONCEPT Cited by 2 sources
RPO / RTO (recovery point / time objectives)¶
Definition¶
The two canonical Disaster Recovery budget dimensions:
- RPO (Recovery Point Objective): the maximum acceptable data loss, measured as the time between the last recoverable point and the disaster. RPO = "how much work am I willing to lose?"
- RTO (Recovery Time Objective): the maximum acceptable downtime before the workload is operational again in the recovery environment. RTO = "how long am I willing to be down?"
Both are business-driven budgets — not engineering specifications. They are chosen first, then the DR tier is chosen to meet them.
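To make the two definitions concrete, here is a minimal sketch that measures the data-loss window and the downtime for a single incident and compares them with the budgets. The function names and the timeline are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta


def achieved_rpo_rto(last_recovery_point, disaster_at, service_restored_at):
    """Return (data_loss_window, downtime) for one incident."""
    # Data written after the last recoverable point is lost.
    data_loss = disaster_at - last_recovery_point
    # Downtime runs from the disaster until the workload serves again.
    downtime = service_restored_at - disaster_at
    return data_loss, downtime


def within_budget(data_loss, downtime, rpo, rto):
    """True only if both objectives are met."""
    return data_loss <= rpo and downtime <= rto


# Hypothetical incident timeline (assumed values, not from any source).
last_snapshot = datetime(2026, 1, 1, 2, 0)
disaster = datetime(2026, 1, 1, 5, 30)
restored = datetime(2026, 1, 1, 7, 0)

loss, down = achieved_rpo_rto(last_snapshot, disaster, restored)
print(loss, down)  # 3:30:00 1:30:00
print(within_budget(loss, down, rpo=timedelta(hours=4), rto=timedelta(hours=1)))  # False
```

In this example the incident meets a 4-hour RPO but misses a 1-hour RTO, which shows that the two budgets have to be checked independently.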
Order-of-magnitude mapping to DR tiers¶
The DR ladder is essentially an RPO/RTO-vs-cost trade curve:
| Tier | RPO | RTO | Cost |
|---|---|---|---|
| Backup-and-restore | Hours (snapshot interval) | Hours–days | Lowest |
| Pilot light | Seconds–minutes (with continuous replication) | Minutes–hours (compute cold-start) | Low |
| Warm standby | Seconds | Seconds–minutes | Higher |
| Multi-site active-active | ~0 (continuous dual-write) | ~0 (already serving) | Highest |
Canonical AWS-primitive RPO/RTO¶
| Primitive | RPO | RTO | Canonical wiki source |
|---|---|---|---|
| systems/aws-backup | Hours (schedule-based) | Hours (restore time) | sources/2026-03-31-aws-streamlining-access-to-dr-capabilities |
| EBS snapshots / AMIs | Snapshot interval (hours) | Minutes–hours | sources/2026-03-31-aws-streamlining-access-to-dr-capabilities |
| AWS DRS | Seconds (crash-consistent, continuous) | 5–20 minutes typical | sources/2026-03-31-aws-streamlining-access-to-dr-capabilities |
Why both are needed¶
RPO and RTO can have opposite cost drivers:
- Cheap RPO, expensive RTO: continuous replication to cold staging keeps data loss to seconds, but recovery is slow because compute must be started from cold.
- Cheap RTO, expensive RPO: warm standby of stateless compute with infrequent snapshots recovers fast but loses more data.
- Both small: multi-site active-active pays for both continuous replication and a continuously live secondary, which makes it the most expensive tier.
DR sizing almost always starts with "our business can tolerate ≤ X minutes of data loss and ≤ Y minutes of downtime" and works backwards to pick the tier.
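The "work backwards from the budget" step can be sketched as a lookup over the tier ladder: keep the tiers whose worst-case RPO and RTO both fit the budget, then take the cheapest. The numeric bounds below are illustrative assumptions picked from within the order-of-magnitude ranges in the table above, not measured or vendor-stated values, and `Tier`, `TIERS`, and `cheapest_tier` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    worst_rpo: timedelta  # assumed upper bound on data loss
    worst_rto: timedelta  # assumed upper bound on downtime
    cost_rank: int        # 1 = cheapest, 4 = most expensive


# Illustrative bounds only; real values depend on the workload.
TIERS = [
    Tier("backup-and-restore", timedelta(hours=24), timedelta(days=2), 1),
    Tier("pilot-light", timedelta(minutes=5), timedelta(hours=2), 2),
    Tier("warm-standby", timedelta(seconds=30), timedelta(minutes=5), 3),
    # "~0" is modeled as one second so a strictly zero budget matches nothing.
    Tier("multi-site-active-active", timedelta(seconds=1), timedelta(seconds=1), 4),
]


def cheapest_tier(rpo_budget: timedelta, rto_budget: timedelta) -> Optional[Tier]:
    """Return the lowest-cost tier meeting both budgets, or None."""
    candidates = [
        t for t in TIERS
        if t.worst_rpo <= rpo_budget and t.worst_rto <= rto_budget
    ]
    return min(candidates, key=lambda t: t.cost_rank, default=None)


# "<= 15 minutes of data loss and <= 4 hours of downtime"
print(cheapest_tier(timedelta(minutes=15), timedelta(hours=4)).name)    # pilot-light
# Tightening only the RTO budget pushes the choice up the ladder.
print(cheapest_tier(timedelta(minutes=15), timedelta(minutes=10)).name)  # warm-standby
```

Both budgets act as independent filters here, so tightening either one alone can force a more expensive tier.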
RPO/RTO in the cross-partition axis¶
sources/2026-01-30-aws-sovereign-failover-design-digital-sovereignty applies the same RPO/RTO framing to the cross-partition axis. The tiers are the same, but each one is more expensive because there is no cross-partition equivalent of S3 Cross-Region Replication, Transit Gateway, or Route 53 cross-Region health checks. The RPO/RTO numbers are also the basis for picking pilot-light as the cross-partition default: its RPO/RTO is acceptable for the discrete, sovereignty-driven failover demand profile.
Seen in¶
- sources/2026-03-31-aws-streamlining-access-to-dr-capabilities — canonical wiki reference; quantifies DRS's seconds-RPO / 5–20-min-RTO; frames the per-tier RPO/RTO tradeoff.
- sources/2026-01-30-aws-sovereign-failover-design-digital-sovereignty — applies the same ladder to cross-partition failover; argues pilot-light as the cross-partition RPO/RTO sweet spot.
Related¶
- concepts/disaster-recovery-tiers — the ladder ordered by RPO/RTO.
- concepts/crash-consistent-replication — the consistency model that makes seconds-RPO feasible without app cooperation.
- systems/aws-backup, systems/aws-elastic-disaster-recovery — the two AWS-native primitives spanning the tier space.
- patterns/pilot-light-deployment, patterns/warm-standby-deployment — specific tier patterns.