Skip to content

SYSTEM Cited by 1 source

AWS Fault Injection Service

What it is

AWS Fault Injection Service (FIS) executes controlled chaos experiments with built-in safety mechanisms on AWS infrastructure. It provides pre-built actions covering common failure scenarios and supports progressive scope expansion with automated stop conditions.

Key capabilities

  • Pre-built fault actions: terminate EC2 instances, inject network latency, throttle API calls, fail over RDS databases, disrupt AZ connectivity.
  • Progressive scope expansion: experiments start at minimal scope (1% of resources) and expand incrementally (1% → 5% → 10% → 25%) based on validation results.
  • Stop conditions: CloudWatch alarm-based guardrails halt experiments before SLA violation (recommended: trigger at 10× margin below SLA threshold, e.g., 0.1% error rate when SLA is 1%).
  • Experiment templates: reusable, shareable via AWS Organizations across accounts and OUs.
  • Scenarios Library: pre-built common failure patterns for teams without chaos engineering expertise.

Design philosophy

FIS follows the progressive deployment strategy from the Well-Architected Reliability Pillar — analogous to canary deployments where changes roll out incrementally to limit blast radius. The service pairs with CloudWatch for real-time monitoring and automatic rollback (Source: sources/2026-06-22-aws-architecting-ai-powered-resilience-framework-on-aws).

Seen in

Last updated · 547 distilled / 1,605 read