CONCEPT

Disaster recovery tiers (backup / pilot light / warm standby / active-active)

Definition

The canonical AWS-lineage disaster-recovery ladder: four tiers ordered by cost, complexity, and recovery time. Picking a tier is choosing how much ongoing cost to trade for lower RTO/RPO.

| Tier | Secondary state | Cost | RTO/RPO | Cross-partition fit |
| --- | --- | --- | --- | --- |
| Backup and restore | Nothing running; periodic backup copies | Lowest | Hours–days | Second-partition backup bucket is feasible with manual copy tooling |
| Pilot light | Data tier replicated; compute tier stopped, built up only when needed | Low | Minutes–hours | Strong fit — duplicate infrastructure cost dominates the cross-partition budget |
| Warm standby | Full stack running at smaller scale | Higher | Seconds–minutes | Needs more complex cross-partition network and data sync |
| Multi-site active-active | Full parallel production | Highest | Effectively zero | Most complex network synchronization across partitions |

(Source: sources/2026-01-30-aws-sovereign-failover-design-digital-sovereignty)

Why the ladder matters for cross-partition design

The AWS Sovereign Failover post is explicit that the same ladder applies across partition boundaries as across regions — only the mechanics change: "Standard cloud resilience models range from simple backups to multi-site setups, and can be implemented across multiple Availability Zones as well as multiple Regions. The same concept equally applies across multiple partitions."

What makes the partition-axis version more expensive at each tier:

  • Backups need external tooling (S3 Cross-Region Replication doesn't work across partitions).
  • Pilot light needs separate identities, PKI, and custom data-synchronization to the replicated data tier.
  • Warm standby adds the same network synchronization problem continuously instead of at failover time.
  • Active-active needs cross-partition traffic shaping, which has to be built (no Route 53 cross-partition health checks, no Global Accelerator across partitions).
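The first bullet can be made concrete. Since S3 Cross-Region Replication does not span partitions, backup copies must transit a host that holds credentials for both sides. A minimal sketch, assuming hypothetical bucket names and S3 clients built from two separate partition-specific sessions (e.g. `boto3.Session(profile_name=...).client("s3")`):

```python
# Sketch of cross-partition backup copy. S3 CRR does not work across
# partitions, so objects are pulled from one partition and pushed to the
# other. Bucket names and client wiring are assumptions for illustration.

def objects_to_copy(src_keys, dst_keys):
    """Pure planning step: keys present in the source bucket that the
    second partition's backup bucket does not yet have."""
    return sorted(set(src_keys) - set(dst_keys))

def copy_backup(src_s3, dst_s3, src_bucket, dst_bucket):
    """src_s3 / dst_s3 are S3 clients carrying partition-specific
    credentials -- each partition needs its own identity."""
    def list_keys(s3, bucket):
        return [
            obj["Key"]
            for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket)
            for obj in page.get("Contents", [])
        ]
    for key in objects_to_copy(list_keys(src_s3, src_bucket),
                               list_keys(dst_s3, dst_bucket)):
        # No server-side copy exists across partitions: the bytes must
        # transit the copy host itself.
        body = src_s3.get_object(Bucket=src_bucket, Key=key)["Body"].read()
        dst_s3.put_object(Bucket=dst_bucket, Key=key, Body=body)
```

The planning step is kept pure so it can be dry-run and tested without touching either partition.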

"Finally, warm standby or multi-site active-active setups mainly differ in the need for more complex network synchronization across partitions." (Source: sources/2026-01-30-aws-sovereign-failover-design-digital-sovereignty)

Pilot light as the sweet spot for cross-partition

The post gives pilot light a specific endorsement for cross-partition use: "We can run an application pilot light in another partition. This greatly reduces the cost of the infrastructure required in the second partition because it will only be built up when needed."

Reasons it's the cross-partition default:

  • Second partition's steady-state spend is just the replicated data tier, not compute.
  • Duplicate IaC is tested periodically (via DR drills) rather than continuously, so the "infrastructure drift" risk is on a weekly / monthly cadence rather than a real-time one.
  • Matches the typical cross-partition demand curve — rare, discrete failover events driven by digital-sovereignty shifts, not minute-scale AZ failures.
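The build-up-on-failover sequence the post describes can be sketched as an ordered runbook: the data tier is promoted first (it is the only part already running), compute is instantiated from the periodically tested IaC, then traffic shifts. Step names here are illustrative assumptions, not AWS APIs:

```python
# Sketch of a pilot-light failover runbook. The data tier is already
# replicated in the second partition; everything else is built up only
# when needed. Step names are hypothetical labels, not service calls.
FAILOVER_STEPS = [
    "promote-replicated-database",        # data tier: replica -> primary
    "apply-iac-compute-stack",            # build compute from tested IaC
    "scale-out-to-production-capacity",   # pilot light -> full size
    "shift-traffic-to-second-partition",  # DNS / client-side routing
]

def next_step(completed):
    """Return the next runbook step, or None once failover is complete."""
    for step in FAILOVER_STEPS:
        if step not in completed:
            return step
    return None
```

Encoding the runbook as data rather than ad-hoc scripts is what makes the periodic DR drills mentioned above repeatable.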

Disaster taxonomy → tier selection

The post names three disaster classes, each pushing you toward a different answer:

  • Natural — mitigated by regions in different geographic zones and features; any tier can stay in-partition.
  • Technical — mitigated by independent parts of the global technical infrastructure (power grids, networks); any tier can stay in-partition.
  • Human-driven — political, socioeconomic, legal; this is the class that pushes you across the partition boundary, and it pairs with the digital-sovereignty framing.

"The choice of Regions depends on the type of disaster you want to mitigate." (Source: sources/2026-01-30-aws-sovereign-failover-design-digital-sovereignty)
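The taxonomy reduces to a small decision function: only the human-driven class forces the partition boundary, and the tier choice is then a separate RTO/RPO trade-off. A sketch, with the class labels taken from the list above:

```python
# Sketch: mapping the post's disaster taxonomy to an isolation boundary.
# Natural and technical disasters can be handled with multi-region
# designs inside one partition; human-driven (political, socioeconomic,
# legal) disasters are the class that requires a second partition.
def isolation_boundary(disaster_class):
    if disaster_class in ("natural", "technical"):
        return "multi-region, same partition"
    if disaster_class == "human-driven":
        return "second partition"
    raise ValueError(f"unknown disaster class: {disaster_class}")
```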

Relationship to this wiki's existing reliability patterns

The pattern is self-similar: pick an isolation boundary (AZ / region / cluster / partition), pick a DR tier, pay the cost.


Native-AWS-primitive mapping (within a single partition)

| Tier | Data | Compute | Native primitive |
| --- | --- | --- | --- |
| Backup and restore | Snapshots / backups | Not running | AWS Backup + EventBridge + Lambda automation |
| Pilot light | Continuously replicated | Stopped / minimal | AWS Elastic Disaster Recovery (DRS) staging; compute instantiated on failover |
| Warm standby | Continuously replicated | Running at reduced scale | AWS DRS + launched instances |
| Multi-site active-active | Bidirectional live replication | Full parallel production | Aurora Global Database / S3 CRR / Route 53 traffic shift — not covered by a single AWS Backup / DRS primitive |

Full-workload recovery across these tiers (networking, IAM, config translation) is typically packaged by AWS Resilience Competency Partners (e.g. Arpio).
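One piece that always has to be built, whichever tier is chosen, is the failover trigger itself: Route 53 health checks and Global Accelerator do not span partitions. A minimal sketch of a custom health poller with a consecutive-failure threshold (endpoint URLs and the threshold value are assumptions):

```python
# Sketch: custom cross-partition failover detection, since no native
# health-check primitive spans partitions. URLs and thresholds are
# illustrative assumptions.
from urllib.request import urlopen

def endpoint_healthy(url, timeout=3):
    """Probe an endpoint; any network error or timeout counts as down."""
    try:
        return 200 <= urlopen(url, timeout=timeout).status < 300
    except OSError:
        return False

def choose_active(primary_healthy, failures, threshold=3):
    """Pure decision step: fail over only after `threshold` consecutive
    failures, to avoid flapping on transient cross-partition network
    issues. Returns (active_side, updated_failure_count)."""
    failures = 0 if primary_healthy else failures + 1
    side = "secondary" if failures >= threshold else "primary"
    return side, failures
```

Keeping the decision pure separates the noisy probing from the failover policy, so the threshold logic can be tested without live endpoints in either partition.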
