Skip to content

CONCEPT Cited by 1 source

Fan-out amplification

Definition

Fan-out amplification is the phenomenon where a single high-level request decomposes into N parallel downstream calls, causing the overall latency to track the slowest of N hops rather than the median. Any shared infrastructure in the fan-out path (load balancers, proxies, sidecars) has its failure modes multiplied by the fan-out factor.

Example

A batch endpoint unpacking 1 request into 100 parallel calls through a shared load balancer: - Each hop adds only ~200μs (negligible individually) - But the batch waits on all 100 — its latency tracks the worst of 100 random samples from the latency distribution - If the shared load balancer has a 1-in-1000 latency spike, the batch hits one approximately every 10 requests

Mitigation Strategies

  • Move routing in-process: Eliminate shared hops on the fan-out path entirely (client-side load balancing)
  • Cap parallel fan-out: Use a FIFO buffer with hard in-flight limits (patterns/fifo-buffer-overload-protection)
  • Retry to different destination: Ensure retries hit a different pod/node to avoid correlated failures
  • Measure and graph the walk distance: Emit bounded-load walk distance to detect when the system approaches its cap

Seen In

Last updated · 559 distilled / 1,651 read