CONCEPT Cited by 1 source
Instantaneous power loss
Definition
Instantaneous power loss is a zero-notice disaster scenario in which a data center region loses all power supply without any advance warning — no time for graceful shutdown, drain, or pre-emptive replica migration. It represents the worst-case power failure, distinct from planned maintenance windows or events with hours of advance notice (hurricanes, scheduled utility work).
Why It's Architecturally Distinctive
Compared with failures that arrive with a few hours of warning:
- No drain window: services cannot migrate state or traffic elsewhere before going dark
- Simultaneous cold start: millions of services must bootstrap autonomously when power returns, which exposes circular bootstrap dependencies
- Region-wide scope: all co-located buildings sharing common power connectivity fail together (50–60× the scale of a typical sub-regional fault domain at Meta)
- Bounded staleness in detection: other regions must detect that this region is unavailable, which is a hard problem in asynchronous systems because a failed region is indistinguishable from a slow one (FLP-adjacent)
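The cold-start point above can be made concrete with a small dependency check. The following is a minimal sketch, not Meta's tooling: the service names and the `deps` map are hypothetical, and it uses Python's standard `graphlib` module to show how a circular bootstrap dependency prevents any valid start order from existing.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical map of service -> services it needs before it can start.
# All names are illustrative assumptions, not taken from the source.
deps = {
    "config_store": {"dns"},
    "dns": {"service_discovery"},
    "service_discovery": {"config_store"},  # closes a cycle
    "web": {"config_store"},
}


def bootstrap_order(dep_map):
    """Return (order, None) if a cold-start order exists, else (None, cycle)."""
    try:
        # static_order() yields dependencies before their dependents.
        return list(TopologicalSorter(dep_map).static_order()), None
    except CycleError as err:
        # The second element of args is the cycle; its first and last
        # nodes are the same.
        return None, err.args[1]


order, cycle = bootstrap_order(deps)
if cycle is not None:
    print("no valid cold-start order; cycle:", " -> ".join(cycle))
```

In a warm region such a cycle can go unnoticed because every service is already running; it only bites when everything starts from zero at once. Breaking it typically means giving one service a way to start without its dependency, for example from a locally cached copy.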
Tolerance Requirements
Meta draws explicit boundaries between:
- Unacceptable: data loss, permanent damage to data center facilities, sustained impact beyond the single affected region
- Tolerable: transient service errors, rack failures within threshold, bounded staleness in routing tables and region-unavailability detection
Only issues that cannot be remediated post-incident within a reasonable MTTR fall outside the tolerance boundary.
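The boundary described above can be read as a simple decision rule. The sketch below is illustrative only: the `Impact` fields, the `classify` function, and the 24-hour `MTTR_BUDGET_HOURS` value are assumptions introduced here, since the source does not state a concrete MTTR threshold.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    TOLERABLE = "tolerable"
    UNACCEPTABLE = "unacceptable"


@dataclass(frozen=True)
class Impact:
    """One observed consequence of the power loss (hypothetical schema)."""
    description: str
    data_loss: bool = False
    permanent_facility_damage: bool = False
    regions_with_sustained_impact: int = 1  # includes the affected region
    est_repair_hours: float = 0.0


# Assumed placeholder; the source only says "a reasonable MTTR".
MTTR_BUDGET_HOURS = 24.0


def classify(impact, mttr_budget_hours=MTTR_BUDGET_HOURS):
    """Apply the three unacceptable criteria, then the MTTR rule."""
    if impact.data_loss or impact.permanent_facility_damage:
        return Verdict.UNACCEPTABLE
    if impact.regions_with_sustained_impact > 1:
        return Verdict.UNACCEPTABLE
    if impact.est_repair_hours > mttr_budget_hours:
        return Verdict.UNACCEPTABLE
    return Verdict.TOLERABLE


print(classify(Impact("transient service errors", est_repair_hours=2.0)))
print(classify(Impact("lost writes", data_loss=True)))
```

Encoding the rule this way makes the ordering explicit: the hard criteria are checked first, and repair time only matters for impacts that are otherwise recoverable.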
Seen in
- sources/2026-06-03-meta-lights-out-systems-on-validating-instant-power-loss-readiness — canonical definition and validation methodology