SYSTEM Cited by 1 source

AWS Elastic Disaster Recovery (AWS DRS)

Definition

AWS Elastic Disaster Recovery (AWS DRS) is AWS's managed service for continuous block-level replication of server workloads (EC2 instances, on-prem servers, other clouds) into AWS, with recovery orchestration and automated server conversion for failover into a target AWS Region or account.

"AWS DRS provides a nearly continuous block-level replication, recovery orchestration, and automated server conversion capabilities. With these, you [can] achieve a crash-consistent recovery point objective of seconds, and a recovery time objective typically ranging between 5–20 minutes." (Source: sources/2026-03-31-aws-streamlining-access-to-dr-capabilities)

Quantified RPO / RTO

Metric                         | Value
RPO (recovery point objective) | Seconds (crash-consistent)
RTO (recovery time objective)  | 5–20 minutes typical

These figures are strictly better than those of snapshot-based approaches (AMIs + AWS Backup), which deliver minutes-to-hours RPO/RTO. The difference comes from two design choices: data is replicated continuously at the block layer rather than on a schedule, and server conversion is pre-orchestrated rather than provisioned from scratch.
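The RPO gap can be illustrated with a little arithmetic: the data lost in a failure is the time between the failure and the most recent recovery point before it. The sketch below is a minimal illustration, not a measurement of either service. The function name `data_loss_window`, the hourly snapshot schedule, and the 5-second recovery-point spacing are all hypothetical values chosen for the example.

```python
from bisect import bisect_right


def data_loss_window(failure_time: float, recovery_points: list[float]) -> float:
    """Seconds of writes lost when recovering to the latest point at or
    before the failure. `recovery_points` must be sorted ascending."""
    idx = bisect_right(recovery_points, failure_time)
    if idx == 0:
        raise ValueError("no recovery point exists at or before the failure")
    return failure_time - recovery_points[idx - 1]


# Hypothetical schedules over one day (times in seconds since midnight).
hourly = [h * 3600.0 for h in range(24)]          # scheduled snapshots
continuous = [s * 5.0 for s in range(24 * 720)]   # a point every 5 seconds

# Assumed failure one second before the 04:00 snapshot would have run.
failure = 3 * 3600.0 + 3599.0

snapshot_loss = data_loss_window(failure, hourly)        # 3599.0 seconds
continuous_loss = data_loss_window(failure, continuous)  # 4.0 seconds
```

Under these assumed schedules, the scheduled approach loses almost the full interval, while the continuous one loses only the time since the last replicated point.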

How it works

  • Continuous block-level replication — agents on source servers replicate disk block changes continuously into a staging subnet in the target AWS Region. No application coordination required — replication is filesystem-agnostic and workload-agnostic.
  • Crash-consistent recovery points — captured state is equivalent to what an unplanned crash+reboot would produce. Not app-consistent (that requires application cooperation or quiescing) but sufficient for modern workloads designed for crash resilience. See concepts/crash-consistent-replication.
  • Automated server conversion — on failover, DRS converts replicated data into launchable EC2 instances with correct instance type, networking, and boot configuration for the target region.
  • Target VPC configuration: "You can also use AWS DRS to configure your recovery Amazon Virtual Private Cloud (Amazon VPC). So, with the right settings, you can get your EC2 networking to look like your primary environment."
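The replication and recovery-point ideas above can be sketched as a toy model. This is not the DRS agent's actual protocol, which the source does not describe. The class names `SourceDisk` and `StagingReplica`, the `replicate` helper, and the 4-byte block size are hypothetical and exist only to show dirty-block tracking and crash-consistent points in time.

```python
BLOCK_SIZE = 4  # bytes; deliberately tiny, for illustration only


class SourceDisk:
    """Toy source volume that remembers which blocks changed since the last sync."""

    def __init__(self, num_blocks: int) -> None:
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks
        self.dirty: set[int] = set()

    def write(self, index: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self.blocks[index] = data
        self.dirty.add(index)


class StagingReplica:
    """Toy staging copy that can freeze point-in-time recovery points."""

    def __init__(self, num_blocks: int) -> None:
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks
        self.recovery_points: list[tuple[str, list[bytes]]] = []

    def apply(self, changes: dict[int, bytes]) -> None:
        for index, data in changes.items():
            self.blocks[index] = data

    def take_recovery_point(self, label: str) -> None:
        # Copy the block list so later replication does not alter this point.
        self.recovery_points.append((label, list(self.blocks)))


def replicate(source: SourceDisk, replica: StagingReplica) -> int:
    """Ship only the changed blocks; return how many were sent."""
    changes = {i: source.blocks[i] for i in sorted(source.dirty)}
    replica.apply(changes)
    source.dirty.clear()
    return len(changes)


disk, replica = SourceDisk(8), StagingReplica(8)
disk.write(2, b"AAAA")
disk.write(5, b"BBBB")
sent_first = replicate(disk, replica)    # two dirty blocks shipped
replica.take_recovery_point("t1")
disk.write(2, b"CCCC")
sent_second = replicate(disk, replica)   # only the one changed block
replica.take_recovery_point("t2")
disk.write(7, b"DDDD")  # not yet replicated: lost if the source crashes now
```

The model works purely on blocks and never inspects files or applications, which mirrors why block-level replication is filesystem-agnostic. The final unreplicated write shows what a crash-consistent recovery point omits.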

Role in the DR ladder

DRS covers the pilot-light and warm-standby tiers of the DR ladder. The adjacent tiers are listed for contrast:

  • Pilot light: staging area holds the continuously-replicated block data; compute is stopped/minimal; failover launches the instances from the replicated state.
  • Warm standby: instances can be kept running at reduced scale for faster RTO.
  • Backup-and-restore (lower tier): AWS Backup handles this — DRS's continuous replication is the step up when RPO must be seconds rather than hours.
  • Multi-site active-active (higher tier): out of scope — DRS is a failover primitive, not an active-active one.

Scope limitations

DRS covers static EC2-style compute (plus on-prem / other-cloud servers replicated in). It does not natively cover:

  • Auto Scaling-created instances (requires launch-template / ASG recreation logic)
  • Lambda functions (no block storage; function code needs separate backup)
  • ECS / EKS workloads (task definitions / pod specs + persistent volume reattachment need separate handling)
  • Fargate (serverless — no host state to replicate)

For full-workload recovery across these compute forms plus networking / IAM / configuration, customers either build custom orchestration or use an AWS Resilience Competency Partner such as Arpio, which builds on DRS + AWS Backup + native service primitives.
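As a rough aid for planning, the scope list above can be encoded as a lookup that splits a workload inventory into what DRS replicates natively and what needs extra handling. This is a minimal sketch: the `ComputeKind` enum, the `plan_recovery` function, and the inventory names are hypothetical, and the gap descriptions simply restate the list above.

```python
from enum import Enum


class ComputeKind(Enum):
    EC2_STATIC = "static EC2 instance"
    ON_PREM_SERVER = "on-prem server"
    OTHER_CLOUD_SERVER = "other-cloud server"
    ASG_INSTANCE = "Auto Scaling-created instance"
    LAMBDA = "Lambda function"
    ECS_EKS = "ECS / EKS workload"
    FARGATE = "Fargate task"


# Compute forms the scope section describes as covered by DRS.
NATIVELY_COVERED = {
    ComputeKind.EC2_STATIC,
    ComputeKind.ON_PREM_SERVER,
    ComputeKind.OTHER_CLOUD_SERVER,
}

# Extra handling needed for the forms DRS does not natively cover.
EXTRA_HANDLING = {
    ComputeKind.ASG_INSTANCE: "launch-template / ASG recreation logic",
    ComputeKind.LAMBDA: "separate backup of function code",
    ComputeKind.ECS_EKS: "task definitions / pod specs and persistent volume reattachment",
    ComputeKind.FARGATE: "separate orchestration (no host state to replicate)",
}


def plan_recovery(
    inventory: dict[str, ComputeKind],
) -> tuple[list[str], dict[str, str]]:
    """Return (names covered by DRS, {name: extra handling needed})."""
    covered: list[str] = []
    gaps: dict[str, str] = {}
    for name, kind in inventory.items():
        if kind in NATIVELY_COVERED:
            covered.append(name)
        else:
            gaps[name] = EXTRA_HANDLING[kind]
    return covered, gaps


# Hypothetical inventory, for illustration only.
covered, gaps = plan_recovery({
    "billing-db": ComputeKind.EC2_STATIC,
    "legacy-erp": ComputeKind.ON_PREM_SERVER,
    "web-fleet": ComputeKind.ASG_INSTANCE,
    "thumbnailer": ComputeKind.LAMBDA,
})
```

Here `covered` holds the two server-style workloads and `gaps` maps the Auto Scaling fleet and the Lambda function to the handling each still needs.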

Seen in

  • sources/2026-03-31-aws-streamlining-access-to-dr-capabilities — canonical wiki reference. Quantified seconds-RPO / 5–20-min-RTO numbers; named as the compute-tier DR building block above AWS Backup; scope limited to static EC2-style workloads (modern serverless / container workloads need additional orchestration).