Skip to content

CONCEPT Cited by 1 source

Adaptive rate limiting

Definition

Adaptive rate limiting is a rate-control strategy where the limit (tokens/second, requests/second, or items/second) is periodically recomputed based on current system state — typically workload size, downstream capacity, or observed error rates — rather than being a static configuration value.

Motivation

Static rate limits break in two directions as systems scale: - Too low → SLA violations as population outgrows the fixed budget. - Too high → downstream overload when population shrinks or downstream degrades.

Adaptive limits track the actual need and adjust automatically.

Variants

  1. Population-proportional — rate = f(current_item_count, target_interval). Recomputed periodically. (Cloudflare Security Insights scheduler.)
  2. Error-driven (AIMD) — increase rate when healthy, cut on errors. (TCP congestion control, adaptive retry.)
  3. Latency-driven — reduce rate when p99 exceeds threshold. (Netflix concurrency limiter.)

Seen in

Last updated · 542 distilled / 1,571 read