Skip to content

PATTERN Cited by 1 source

Progressive fault injection

Pattern

Progressive fault injection applies canary-deployment principles to chaos experiments: inject faults into a tiny fraction of the system, validate safety, then expand scope incrementally. Each expansion step is gated by automated health checks — if thresholds breach, the experiment halts before customer impact.

Mechanism

  1. Start minimal — affect only 1% of target resources.
  2. Validate — monitor error rates, latency, and availability against stop-condition thresholds set well below SLA limits.
  3. Expand — if stable, increase to 5%, then 10%, then 25%.
  4. Halt on breach — any stop-condition alarm immediately terminates the experiment and rolls back injected faults.

Why it works

By limiting the initial blast radius to a statistically insignificant fraction of traffic, teams gain confidence that experiments reveal weaknesses without causing outages. The progressive nature also produces a gradient of failure tolerance data: if the system survives 5% but fails at 10%, that pinpoints exact capacity margins.

Seen in

Last updated · 547 distilled / 1,605 read