PATTERN Cited by 1 source

Gateway throttling by dimension

Admission control at a query/request gateway with a separate limit for each dimension: per-user, per-source, per-IP, and global. When the gateway is under heavy load, it rejects requests (or queues them with hard bounds). Each limit is calibrated so that a single misbehaving actor cannot exhaust the gateway's capacity for everyone else.
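The "queue with hard bounds" option can be illustrated with a minimal sketch. The class name `BoundedAdmissionQueue` and the parameters `max_in_flight` and `max_queued` are hypothetical, chosen for illustration only; the source does not describe how the gateway queues requests.

```python
from collections import deque


class BoundedAdmissionQueue:
    """Hypothetical sketch: run, queue, or reject, with hard bounds on both."""

    def __init__(self, max_in_flight, max_queued):
        self.max_in_flight = max_in_flight  # assumed cap on concurrent requests
        self.max_queued = max_queued        # assumed hard bound on waiting requests
        self.in_flight = 0
        self.waiting = deque()

    def submit(self, request):
        """Return "run", "queued", or "rejected" for an incoming request."""
        if self.in_flight < self.max_in_flight:
            self.in_flight += 1
            return "run"
        if len(self.waiting) < self.max_queued:
            self.waiting.append(request)
            return "queued"
        # Both bounds are full: shed load instead of growing without limit.
        return "rejected"

    def finish(self):
        """Call when a running request completes.

        Returns the oldest waiting request (which takes over the freed slot),
        or None if nothing is waiting.
        """
        if self.waiting:
            return self.waiting.popleft()
        self.in_flight -= 1
        return None
```

The point of the hard bound is that memory use and queueing delay stay capped under overload: once both limits are reached, further requests fail fast rather than piling up inside the gateway process.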

Why multi-dimensional

A purely global throttle degrades every caller when one caller misbehaves — the classic one-service-kills-the-gateway failure mode. Per-dimension throttling localises the blast radius:

  • Per-user / per-source / per-IP: caps how much load one actor can drive through the gateway, so when a client-side bug generates millions of queries, only the offender is throttled and the shared infrastructure stays available.
  • Global: still necessary for overall gateway protection when aggregate load exceeds gateway capacity.

The combination means the gateway can protect itself against both single-caller spikes and aggregate load, without over-throttling well-behaved traffic.
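One way to combine per-dimension and global limits is a set of token buckets that a request must all pass. The following is a minimal sketch under stated assumptions, not the gateway's actual implementation: the token-bucket algorithm, the names `TokenBucket` and `DimensionalThrottle`, and the rate and capacity values are all illustrative choices that the source does not specify.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = now

    def refill(self, now):
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now


class DimensionalThrottle:
    """Hypothetical sketch: one bucket per (dimension, key) plus a global one."""

    def __init__(self, limits, global_limit):
        self.limits = limits    # e.g. {"user": (rate, capacity), "ip": (...)}
        self.buckets = {}       # (dimension, key) -> TokenBucket
        self.global_bucket = TokenBucket(*global_limit)

    def admit(self, request, now=None):
        """Return (admitted, rejecting_dimension_or_None)."""
        now = time.monotonic() if now is None else now
        checks = []
        for dim, (rate, capacity) in self.limits.items():
            key = (dim, request[dim])
            bucket = self.buckets.get(key)
            if bucket is None:
                bucket = self.buckets[key] = TokenBucket(rate, capacity, now)
            checks.append((dim, bucket))
        # The global bucket is checked last, so an actor that is over its own
        # limit is rejected without consuming any shared global capacity.
        checks.append(("global", self.global_bucket))
        for dim, bucket in checks:
            bucket.refill(now)
            if bucket.tokens < 1:
                return False, dim
        for _, bucket in checks:
            bucket.tokens -= 1
        return True, None


# Usage: a noisy user exhausts its own bucket; a quiet user is unaffected.
throttle = DimensionalThrottle({"user": (1.0, 2.0)}, global_limit=(100.0, 100.0))
noisy = [throttle.admit({"user": "noisy"}, now=0.0) for _ in range(5)]
quiet = throttle.admit({"user": "quiet"}, now=0.0)
```

In this sketch a rejected request charges no bucket at all, which is what keeps the blast radius local: the noisy caller's excess traffic never draws down the global budget that well-behaved callers depend on.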

Context: unintentional DDoS

The failure mode that motivated this pattern at Meta scale was explicitly named:

"one service unintentionally bombarding the Gateway with millions of queries in a short span, resulting in the Gateway processes crashing and unable to route any queries."

Throttling is the admission-control half of the response; horizontal autoscaling is the elasticity half. Together they produce a gateway that "can withstand adverse unpredictable traffic patterns."

Seen in
