Per-priority AIMD coefficients¶
Definition¶
Per-priority AIMD coefficients is the technique of running independent AIMD rate loops, one per traffic class, with different increase and decrease constants for each class. Higher-priority classes get:
- Larger additive-increase (they speed up faster in recovery).
- Smaller multiplicative-decrease ratio (they slow down less aggressively under congestion).
The effect: under load, the capacity released by aggressively-shrunk low-priority flows is probed and claimed by the still-large high-priority flows, so overall system throughput is re-allocated toward high-priority traffic automatically, without any explicit inter-class coordination.
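A minimal sketch of one such rate loop (names and structure are illustrative, not taken from any particular implementation):

```python
class AimdThrottle:
    """One independent AIMD rate loop for a single traffic class."""

    def __init__(self, additive_increase, multiplicative_decrease, rate=1.0):
        self.additive_increase = additive_increase            # amount added per uncongested tick
        self.multiplicative_decrease = multiplicative_decrease  # fraction retained per congested tick
        self.rate = rate

    def on_tick(self, congested):
        """Apply this class's coefficients to this class's rate; no shared state."""
        if congested:
            self.rate *= self.multiplicative_decrease
        else:
            self.rate += self.additive_increase
        return self.rate
```

A high-priority class would be constructed with a large `additive_increase` and a `multiplicative_decrease` close to 1; a low-priority class with the opposite.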
Why plain AIMD is the wrong tool for prioritization¶
Standard AIMD with identical coefficients converges to fair shares — that's one of its defining properties (see concepts/additive-increase-multiplicative-decrease-aimd). Fairness is the wrong goal when the business cares that:
- Order confirmations must process on time (P1, SLO-protected).
- Marketing pushes can wait during peaks (P3, not SLO-protected).
Equal-coefficient AIMD would throttle these identically — so order confirmations suffer just as much as marketing during capacity pressure. Per-priority coefficients break the fairness property intentionally to implement differentiated service.
Example coefficient table¶
From Zalando's 2024 communications-platform post (Source: sources/2024-04-22-zalando-enhancing-distributed-system-load-shedding-with-tcp-congestion-control-algorithm), a 3-priority system:
| Priority | Additive increase | Multiplicative decrease |
|---|---|---|
| P1 (critical) | +15 | ×0.80 (−20%) |
| P2 (normal) | +10 | ×0.60 (−40%) |
| P3 (bulk) | +5 | ×0.40 (−60%) |
Read the rows:
- On a "not congested" tick: P1 rate climbs 3× faster than P3.
- On a "congested" tick: P3 contracts by 60% while P1 contracts by 20%.
Over a sustained load episode with alternating ticks, the gap widens: P1 stays near its pre-episode rate while P3 collapses toward a small fraction of it.
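The widening gap is easy to reproduce numerically. The sketch below (starting rates and tick count are arbitrary assumptions, only the coefficients come from the table) runs all three classes through alternating congested/uncongested ticks:

```python
# Coefficients from the table: (additive increase, multiplicative decrease).
coeffs = {"P1": (15, 0.80), "P2": (10, 0.60), "P3": (5, 0.40)}
rates = {p: 1000.0 for p in coeffs}  # identical starting rates (arbitrary)

for tick in range(20):           # 10 congested + 10 uncongested ticks, alternating
    congested = tick % 2 == 0
    for p, (ai, md) in coeffs.items():
        rates[p] = rates[p] * md if congested else rates[p] + ai

for p in coeffs:
    print(p, round(rates[p], 1))
```

Despite every class seeing the exact same tick sequence, P1 ends up more than an order of magnitude above P3: the asymmetric coefficients alone produce the re-allocation.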
The coordination property that makes this work¶
No throttle reads any other throttle's state. Each event type's AIMD instance applies its own coefficients to its own rate variable, purely locally (Zalando's post is explicit: "there is no coordination between different throttles!"). What makes the prioritization emerge is:
- The shared congestion signal — all throttles see the same "congested / not congested" decision.
- The asymmetric coefficients — same signal, different response per class.
- The shared physical capacity — when low-priority classes shrink, the saved capacity doesn't sit idle; the additive-increase probes of still-large high-priority classes consume it on the next not-congested tick.
This is the same structural property that makes TCP congestion control work across the internet: a shared observable (packet loss on a link) plus local adaptation rules produce a globally coherent allocation.
Tuning knobs and pitfalls¶
- Ratio, not absolute value, is what matters. If you halve all additive values and double the frequency of the congestion signal, behaviour is approximately preserved. What preserves the priority ordering is the relative size of the coefficients across classes.
- Starvation is possible. If P3's multiplicative decrease is harsh enough, it can shrink to 0 and never recover (the additive increase from 0 is slow). A minimum rate floor per class prevents starvation; the Zalando post doesn't explicitly describe this but it's the standard mitigation.
- Too-large P1 additive. If P1 is allowed to grow arbitrarily on no-congestion ticks, one busy P1 period can claim all capacity, making P2/P3 never recover once the load episode ends. Class-specific maxima cap the upper bound.
- A flapping congestion signal. If the signal flaps (e.g. the threshold sits right at the platform's natural latency), all throttles oscillate together. Hysteresis or smoothing (an EWMA on the signal) is the standard fix.
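The last three mitigations can be sketched as guard rails around a single tick. The floor, cap, and smoothing constant below are illustrative values of my own, not from the Zalando post:

```python
def smooth(prev_ewma, raw_signal, alpha=0.2):
    """EWMA on the raw congestion measurement to damp flapping.

    alpha is a hypothetical smoothing constant; smaller values smooth more.
    """
    return alpha * raw_signal + (1 - alpha) * prev_ewma


def step(rate, congested, ai, md, floor=1.0, cap=10_000.0):
    """One AIMD tick with a starvation floor and a per-class cap.

    floor keeps a harshly-decreased class from shrinking to zero;
    cap keeps an aggressively-increased class from claiming everything.
    """
    rate = rate * md if congested else rate + ai
    return min(max(rate, floor), cap)
```

The floor and cap would typically differ per class: a generous cap for P1, a small but non-zero floor for P3.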
Seen in¶
- Zalando — Enhancing Distributed System Load Shedding with TCP Congestion Control Algorithm (2024-04-22) — three priority classes across 1,000+ Nakadi event types, with increase constants `{15, 10, 5}` and decrease ratios `{0.8, 0.6, 0.4}`. Order-confirmation processing time stays flat through load episodes while commercial-message processing time rises — the coefficient table converted into observed, SLO-aligned latency behaviour in production.
Related¶
- concepts/additive-increase-multiplicative-decrease-aimd
- concepts/critical-business-operation — the business concept that motivates priority classes.
- concepts/service-level-objective — the SLO is the reason P1 is protected.
- concepts/load-shedding-at-ingestion
- patterns/priority-differentiated-load-shedding
- patterns/aimd-ingestion-rate-control