CONCEPT Cited by 1 source
Cyber resilience¶
Definition¶
Cyber resilience is the recovery leg of a three-leg security posture: prevention keeps threat actors out, detection finds them quickly, and cyber resilience restores a trustworthy environment when the source environment itself can no longer be trusted.
Verbatim from the canonicalising source:
"Cyber resilience is the ability to recover workloads to a known-good state after an adversary has affected the environment. Prevention works to keep threat actors out and detection works to find them quickly. Cyber resilience focuses on recovery: restoring a trustworthy environment when backups, credentials, or parts of the infrastructure can no longer be assumed to be safe." (Source: sources/2026-05-20-aws-cyber-resilience-on-aws-a-reference-approach-for-recovery-from-ransomware-and-destructive-events)
Why cyber resilience is structurally different from generic DR¶
Generic disaster recovery handles fault disasters — a region fails, a power grid drops, a fibre cut takes out connectivity. The recovery substrate (backups, secondary infra, control plane) is trusted — only the primary failed.
Cyber resilience handles adversary disasters where any of the following may have been compromised:
- Production credentials — cannot be assumed clean; rotating every secret is a recovery requirement, not a routine task.
- Production data — the most recent backup may carry the same malware, encrypted files, or modified configurations as production.
- Production infrastructure configuration — the running configuration may have been modified by the attacker; rebuilding from version-controlled IaC is safer than restoring config from backup.
- The recovery path itself — if recovery uses the same credentials, network paths, or accounts as production, it inherits the compromise.
The architectural consequence: recovery cannot trust anything from the source environment by default. This drives every primitive in a cyber-resilience design — separate accounts, deletion-protected vaults, multi-party approval gates, validation pipelines, IaC-driven rebuild, comprehensive credential rotation.
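The "no trust path back to production" rule can be checked mechanically before an incident. The following is a minimal sketch, assuming each environment can be described as sets of account IDs, credential names, and network paths; the `Environment` type, its fields, and the sample values are hypothetical and not taken from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Hypothetical description of one environment's trust surface."""
    name: str
    accounts: frozenset[str]
    credentials: frozenset[str]
    network_paths: frozenset[str]


def shared_trust_paths(
    production: Environment, recovery: Environment
) -> dict[str, frozenset[str]]:
    """Return every account, credential, or network path both environments use.

    An empty result means the recovery path inherits nothing from production
    along these three axes; any entry is a path a compromise could follow.
    """
    overlaps = {
        "accounts": production.accounts & recovery.accounts,
        "credentials": production.credentials & recovery.credentials,
        "network_paths": production.network_paths & recovery.network_paths,
    }
    return {kind: items for kind, items in overlaps.items() if items}


# Illustrative values only: the recovery account reuses one production role.
production = Environment(
    name="production",
    accounts=frozenset({"111111111111"}),
    credentials=frozenset({"role/deploy", "role/backup-operator"}),
    network_paths=frozenset({"vpc-prod"}),
)
recovery = Environment(
    name="recovery",
    accounts=frozenset({"222222222222"}),
    credentials=frozenset({"role/backup-operator"}),
    network_paths=frozenset({"vpc-recovery"}),
)

for kind, items in shared_trust_paths(production, recovery).items():
    print(f"shared {kind}: {sorted(items)}")
```

A check like this only covers what has been inventoried; it does not prove isolation, it flags known overlaps.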
Core architectural primitives¶
Cyber-resilience designs assemble five primitives (see the canonicalising source for the full reference architecture):
- Account isolation — three-account topology (Production / Recovery / IRE, the isolated recovery environment) so the recovery surface has no trust path back to the potentially compromised production surface.
- Service-enforced deletion protection — logically air-gapped vault in Compliance mode (or S3 Object Lock) so even root / compromised admin can't shorten retention or delete recovery points within the retention window.
- Multi-party approval — MPA gate before any restore proceeds; recorded in CloudTrail.
- Multi-layer validation — validation pipeline proving the backup is safe to use, not just recoverable. Runs inside the IRE so a tainted restore stays contained.
- Rebuild-Restore-Rotate framework — three-category sorting of what comes from where: infrastructure is code, data is backup, credentials are new.
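The three-category sorting in the Rebuild-Restore-Rotate framework can be illustrated as a lookup from asset kind to recovery action. This is a sketch under assumptions: the asset kinds, the `Action` enum, and `plan_recovery` are hypothetical names chosen for illustration, and a real inventory would be far richer.

```python
from enum import Enum


class Action(Enum):
    REBUILD = "rebuild from version-controlled IaC"
    RESTORE = "restore from a validated backup"
    ROTATE = "issue new credentials"


# Hypothetical asset kinds mapped to the framework's three categories:
# infrastructure is code, data is backup, credentials are new.
_ACTION_BY_KIND = {
    "infrastructure": Action.REBUILD,
    "configuration": Action.REBUILD,
    "data": Action.RESTORE,
    "credential": Action.ROTATE,
}


def plan_recovery(assets: dict[str, str]) -> dict[Action, list[str]]:
    """Sort assets (name -> kind) into rebuild, restore, and rotate lists."""
    plan: dict[Action, list[str]] = {action: [] for action in Action}
    for name, kind in sorted(assets.items()):
        action = _ACTION_BY_KIND.get(kind)
        if action is None:
            # Refuse to guess: an unclassified asset needs a human decision.
            raise ValueError(f"no recovery category for {name!r} (kind {kind!r})")
        plan[action].append(name)
    return plan


# Illustrative inventory only.
inventory = {
    "vpc": "infrastructure",
    "alb-listener-rules": "configuration",
    "orders-db": "data",
    "db-password": "credential",
}

for action, names in plan_recovery(inventory).items():
    print(f"{action.name}: {names}")
```

Raising on an unknown kind is a deliberate choice in this sketch: silently defaulting to "restore" would reintroduce the trust in the source environment that the framework is meant to remove.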
Recovery-point selection: the most-recent-working copy¶
Generic DR's heuristic ("restore the most recent backup") fails for cyber events because the most recent backup may already carry the adversary's payload. Cyber-resilience selection (compromise-boundary RP selection) is reverse-chronological from before the event boundary: build an investigation timeline, walk backwards from the most recent candidate that predates the earliest indicator, validate, and step further back if validation fails.
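The reverse-chronological walk can be sketched in a few lines. This is an illustrative sketch, not the source's implementation: recovery points are represented only by their timestamps, and `is_clean` is a hypothetical callback standing in for the validation pipeline.

```python
from datetime import datetime
from typing import Callable, Iterable, Optional


def select_recovery_point(
    recovery_points: Iterable[datetime],
    earliest_indicator: datetime,
    is_clean: Callable[[datetime], bool],
) -> Optional[datetime]:
    """Pick the newest recovery point that predates the event boundary and validates.

    Points taken at or after the earliest indicator are never considered.
    Remaining candidates are tried newest first; the first one that passes
    validation wins. Returns None if no candidate passes.
    """
    candidates = sorted(
        (rp for rp in recovery_points if rp < earliest_indicator),
        reverse=True,
    )
    for rp in candidates:
        if is_clean(rp):
            return rp
    return None


# Illustrative data: daily backups, earliest indicator found on 7 May at noon,
# and validation that happens to fail for anything taken on or after 6 May.
daily_points = [datetime(2026, 5, day) for day in range(1, 11)]
indicator = datetime(2026, 5, 7, 12)
tainted_from = datetime(2026, 5, 6)

chosen = select_recovery_point(
    daily_points, indicator, is_clean=lambda rp: rp < tainted_from
)
print(chosen)
```

In this example the two newest candidates predate the first indicator but still fail validation, so the walk steps back past them, which is the case the plain "restore the most recent backup" heuristic misses.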
This is why retention windows for cyber resilience need to be sized based on detection latency, not just the routine RPO target — "backup retention should include recovery points that predate realistic detection windows in your organization".
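As a rough illustration of sizing retention by detection latency, the sketch below adds up the terms that push the required window out. The formula is an assumption made for illustration, not something the source states: it assumes fixed-interval backups and adds one backup interval because the last clean point can be up to one interval older than the compromise itself. `investigation_margin` is a hypothetical allowance for the time spent building the timeline.

```python
from datetime import timedelta


def minimum_retention(
    detection_latency: timedelta,
    backup_interval: timedelta,
    investigation_margin: timedelta = timedelta(0),
) -> timedelta:
    """Smallest retention window that still holds a pre-compromise recovery point.

    Assumes backups run every `backup_interval` and the compromise is detected
    `detection_latency` after it began.
    """
    return detection_latency + backup_interval + investigation_margin


def retention_is_sufficient(
    retention: timedelta,
    detection_latency: timedelta,
    backup_interval: timedelta,
    investigation_margin: timedelta = timedelta(0),
) -> bool:
    """True if `retention` meets the minimum computed above."""
    required = minimum_retention(
        detection_latency, backup_interval, investigation_margin
    )
    return retention >= required


# Illustrative numbers only: 21 days to detect, daily backups, 7 days to investigate.
required = minimum_retention(
    timedelta(days=21), timedelta(days=1), timedelta(days=7)
)
print(required.days)
```

Under these illustrative numbers, a 14-day retention policy that comfortably meets a daily RPO would hold no recovery point from before the compromise.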
Operational prerequisite: exercise the workflow¶
Cyber events are rare, so the recovery muscle memory has to come from drills, not real incidents. The canonicalising source's seven-step starting checklist ends with "Exercise the full workflow, including investigation, validation, rebuild, restore, and cutover, on a regular schedule" — the highest-leverage item, because the rest of the architecture is only useful to the extent it's been exercised.
This composes with concepts/drill-muscle-memory and concepts/chaos-engineering at the cyber-event altitude.
Relation to general DR¶
Cyber resilience extends generic DR rather than replacing it:
| Axis | Generic DR | Cyber resilience extension |
|---|---|---|
| Recovery target | Most recent backup | Most recent backup that predates the event boundary and passes validation |
| Backup trust | Trusted | Validated by multi-layer pipeline |
| Account topology | Production + secondary | Production + Recovery + IRE |
| Restore authorization | Standard IAM | MPA-gated |
| Credential handling | Restored | Rotated/re-issued |
| Config handling | Restored | Rebuilt from IaC |
| Rebuild surface | Same as source | Isolated environment |
The 2026-03-31 streamlining-DR post canonicalised the general-DR layer; the 2026-05-20 cyber-resilience post canonicalises this extension layer.
Adversary-modelling assumption: the IaC source itself¶
The Rebuild-Restore-Rotate framework's load-bearing assumption is that the IaC source (templates, pipelines, source repositories) wasn't itself the attack target. If it was, "recovery starts further upstream with a trusted copy of source before rebuild can begin" — which is why knowing where your known-good source of configuration lives, and how it is protected, is a recovery design decision worth making in advance, not at incident time.
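One way to make "known-good source" checkable is to compare the IaC working copy against a digest manifest kept outside the production trust boundary. The sketch below is an assumption-laden illustration, not a method described by the source: the manifest format (relative path to SHA-256 hex digest) and the function names are hypothetical, and it says nothing about how the manifest itself is protected.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def untrusted_files(source_dir: Path, known_good: dict[str, str]) -> list[str]:
    """List relative paths that are missing, modified, or absent from the manifest.

    `known_good` maps POSIX-style relative paths to expected SHA-256 digests
    and is assumed to come from a location the adversary could not modify.
    """
    present = {
        path.relative_to(source_dir).as_posix()
        for path in source_dir.rglob("*")
        if path.is_file()
    }
    problems = []
    for rel in sorted(present | set(known_good)):
        if rel not in present:
            problems.append(rel)  # expected file was deleted
        elif rel not in known_good:
            problems.append(rel)  # unexpected file was added
        elif sha256_of(source_dir / rel) != known_good[rel]:
            problems.append(rel)  # content was changed
    return problems
```

An empty list means the working copy matches the manifest; any entry is a reason to go further upstream for a trusted copy before rebuild begins.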
Seen in¶
- sources/2026-05-20-aws-cyber-resilience-on-aws-a-reference-approach-for-recovery-from-ransomware-and-destructive-events — canonical wiki reference; defines the prevention/detection/cyber-resilience triad; lays out the full reference architecture (three-account topology, vault, MPA, validation, Rebuild-Restore-Rotate, parallel five-stage workflow, event-boundary RP selection, exercise-the-workflow checklist).
- sources/2026-03-31-aws-streamlining-access-to-dr-capabilities — first wiki reference to cross-account-backup as the ransomware/malware isolation axis; the upstream concept this page builds on.
Related¶
- concepts/disaster-recovery-tiers — the general DR ladder cyber resilience extends.
- concepts/clean-room-recovery-account — the parent isolation concept; cyber resilience adds the IRE as a third account.
- concepts/isolated-recovery-environment — the rebuild surface.
- concepts/cross-account-backup + concepts/cross-region-backup — the orthogonal axes cyber resilience composes with.
- concepts/blast-radius — the containment principle.
- concepts/rebuild-restore-rotate-framework — the recovery decomposition framework.
- concepts/multi-layer-restore-validation-pipeline — what validates safe to use.
- concepts/compromise-boundary-recovery-point-selection — what selects which point to use.
- concepts/parallel-recovery-stages — the workflow shape.
- systems/aws-backup-logically-air-gapped-vault — the deletion-protection primitive.
- systems/aws-multi-party-approval — the gate primitive.