
PATTERN Cited by 1 source

Cross-partition failover

Shape

An architecture that deploys duplicate infrastructure across two or more AWS partitions so a workload can continue operating when the primary partition becomes unavailable — specifically covering human-driven disasters (geopolitical shifts, sanctions, regulatory changes) that regional redundancy inside a single partition cannot protect against.

Four design surfaces, each partition-aware:

  1. Deployment tier — pick from backup / pilot light / warm standby / active-active; pilot light is the cross-partition default (second-partition infrastructure only built up when needed).
  2. Network connectivity — exactly three options: internet-over-TLS, IPsec Site-to-Site VPN over internet, or Direct Connect gateway / PoP-to-PoP partner connections.
  3. Identity — federated IdP preferred; IAM roles with trust + external IDs, STS regional endpoints, resource-based policies, cross-account roles via Organizations as alternatives.
  4. PKI — per-partition Private CA; cross-signed root CAs ("double-signed certificates") for authenticated cross-partition mTLS.
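
One concrete form of partition-awareness is never hard-coding the `arn:aws:` prefix in templates or policies, since the partition is the second field of every ARN. The sketch below is illustrative only: `PartitionContext` and its `arn` method are hypothetical helper names, not an AWS API, and the account IDs are placeholders. It lists only the partition identifiers `aws`, `aws-us-gov`, and `aws-cn`; the identifier for the European Sovereign Cloud is not given in the source and is deliberately left out.

```python
from dataclasses import dataclass

# Partition identifiers as they appear in the second field of an ARN.
# Assumption: only these three are listed; extend the set for other partitions.
KNOWN_PARTITIONS = {"aws", "aws-us-gov", "aws-cn"}


@dataclass(frozen=True)
class PartitionContext:
    """Hypothetical holder for the per-partition values a template needs."""

    partition: str
    region: str
    account_id: str

    def __post_init__(self) -> None:
        if self.partition not in KNOWN_PARTITIONS:
            raise ValueError(f"unknown partition: {self.partition!r}")

    def arn(self, service: str, resource: str, regional: bool = True) -> str:
        """Build arn:partition:service:region:account-id:resource.

        Global services such as IAM leave the region field empty,
        so pass regional=False for them.
        """
        region = self.region if regional else ""
        return f"arn:{self.partition}:{service}:{region}:{self.account_id}:{resource}"


# Placeholder account IDs and resource names, for illustration only.
primary = PartitionContext("aws", "eu-central-1", "111111111111")
standby = PartitionContext("aws-us-gov", "us-gov-west-1", "222222222222")

primary_role_arn = primary.arn("iam", "role/failover-operator", regional=False)
standby_queue_arn = standby.arn("sqs", "orders")
```

Passing the context in as a parameter, rather than embedding literals, is what lets the same template render for either partition.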

Why it's different from cross-region failover

Cross-region inside one partition uses first-class AWS primitives — S3 Cross-Region Replication, Transit Gateway inter-region peering, Route 53 health checks, Global Accelerator. None of these work across partitions. "Environments must be pre-provisioned and kept in sync through internal or external tooling. Without such an architecture, failover between partitions is impractical. Cross-partition architectures make failover possible but require duplicate infrastructure, separate identity systems, and custom data synchronization." (Source: sources/2026-01-30-aws-sovereign-failover-design-digital-sovereignty)

The engineering work moves to custom data replication between partitions, a separate IAM topology, cross-signed PKI, separate Organizations (mandatory for European Sovereign Cloud, optional for GovCloud), and per-partition Transit Gateways, Route 53 zones, Config aggregators, Security Hub instances, and SCPs.
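
As a rough illustration of what "custom data replication" can mean in practice, the sketch below copies objects from a primary store to a standby store when they are missing or their content differs. Everything here is an assumption made for illustration: `ObjectStore`, `InMemoryStore`, and `sync_once` are hypothetical names, and the in-memory store stands in for per-partition storage clients that would each carry their own credentials and endpoints. It does not propagate deletions or handle conflicts, which a real synchronization job would need to address.

```python
import hashlib
from typing import Dict, List, Optional, Protocol


class ObjectStore(Protocol):
    """Hypothetical minimal interface for a per-partition object store."""

    def list_keys(self) -> List[str]: ...
    def get(self, key: str) -> bytes: ...
    def put(self, key: str, data: bytes) -> None: ...


class InMemoryStore:
    """Stand-in store so the sketch runs without any cloud access."""

    def __init__(self, objects: Optional[Dict[str, bytes]] = None) -> None:
        self._objects: Dict[str, bytes] = dict(objects or {})

    def list_keys(self) -> List[str]:
        return sorted(self._objects)

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data


def _digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def sync_once(source: ObjectStore, target: ObjectStore) -> List[str]:
    """Copy objects that are missing or different in the target.

    Returns the keys that were copied. Deletions are not propagated.
    """
    target_keys = set(target.list_keys())
    copied: List[str] = []
    for key in source.list_keys():
        data = source.get(key)
        if key in target_keys and _digest(target.get(key)) == _digest(data):
            continue  # already in sync
        target.put(key, data)
        copied.append(key)
    return copied


primary_store = InMemoryStore({"a": b"1", "b": b"2", "c": b"3"})
standby_store = InMemoryStore({"a": b"1", "b": b"stale"})
first_pass = sync_once(primary_store, standby_store)
second_pass = sync_once(primary_store, standby_store)
```

Running the job on a schedule keeps a pilot-light partition's data close to current; how stale the standby may be is a recovery-point decision the pattern leaves to the operator.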

Partition-selection framing

The target-partition choice follows the disaster class:

  • Natural disasters → different geographic zones; cross-region redundancy within a partition often still suffices.
  • Technical disasters → different power grids / networks; cross-region within a partition usually works.
  • Human-driven disasters (political / socioeconomic / legal) → cross-partition is the right axis; specifically, partitions designed for the sovereignty requirement (GovCloud for US public sector, European Sovereign Cloud for EU sovereignty, AWS China Regions for Chinese data-sovereignty law).
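
The selection rule above can be written down as a small lookup. This is only a sketch of the framing, with hypothetical names (`DisasterClass`, `RedundancyAxis`, `recommended_axis`). It encodes "often suffices" as a default, not a guarantee, so the result is a starting point for a risk assessment and not a substitute for one.

```python
from enum import Enum
from typing import Iterable


class DisasterClass(Enum):
    NATURAL = "natural"
    TECHNICAL = "technical"
    HUMAN_DRIVEN = "human-driven"  # political / socioeconomic / legal


class RedundancyAxis(Enum):
    CROSS_REGION = "cross-region within one partition"
    CROSS_PARTITION = "cross-partition"


def recommended_axis(threats: Iterable[DisasterClass]) -> RedundancyAxis:
    """Return the redundancy axis suggested by the threat model.

    Any human-driven threat pushes the recommendation to cross-partition;
    natural and technical threats default to cross-region.
    """
    if DisasterClass.HUMAN_DRIVEN in set(threats):
        return RedundancyAxis.CROSS_PARTITION
    return RedundancyAxis.CROSS_REGION


regional_only = recommended_axis([DisasterClass.NATURAL, DisasterClass.TECHNICAL])
sovereignty = recommended_axis([DisasterClass.TECHNICAL, DisasterClass.HUMAN_DRIVEN])
```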

Worked cases the post cites

  • Military / defense — connecting GovCloud to commercial for multi-tenant / cross-domain workloads.
  • Emergency response systems — "requiring secure partition isolation combined with unified management (a single pane of glass approach)."
  • Vendor-independence against geopolitical risk — cross-partition as a cheaper alternative to cross-cloud (IaC templates reuse partition-to-partition).

Tradeoffs

  • + Protects against human-driven disasters regional redundancy can't reach; strongest answer to digital-sovereignty demand.
  • + Reuses AWS IaC templates, service APIs, and operational muscle memory partition-to-partition.
  • - Duplicate infrastructure cost even at pilot-light tier.
  • - Separate IAM / PKI / Organizations topology; separate Transit Gateway / Route 53 / Config / Security Hub footprints.
  • - Custom data synchronization; no Cross-Region Replication analog.
  • - Tooling gaps: Control Tower doesn't manage GovCloud or European Sovereign accounts; some AWS Organizations features are unavailable in those partitions.
  • - Operational complexity at the PKI layer (cross-signed CAs) in regulated-mTLS scenarios.

Seen in
