Priority-differentiated load shedding
Problem
Uniform load shedding — "drop N% of incoming work under pressure" — treats all work as equally valuable. Production systems rarely have this property. Some traffic is SLO-protected (order confirmations, auth, payments, incident communications) and some is bulk / deferrable (marketing pushes, analytics ingestion, recommendation refreshes).
Under a shared capacity constraint, uniform shedding:
- Breaks SLOs gratuitously. A 20% shed cuts critical traffic by 20%, even though there would be ample capacity for it if the bulk traffic yielded first.
- Discards business value disproportionately. Order-confirmation latency affects customer trust and revenue directly; marketing-push latency rarely does.
- Misses the actual design question. The business question isn't "what's our shed policy?" — it's "whose work gets done first when we're squeezed?".
Solution
Run shedding with asymmetric per-class parameters so that, under load, lower-priority classes yield capacity first and most, and released capacity is automatically re-allocated to still-active higher-priority classes.
The canonical realization pairs this pattern with AIMD (patterns/aimd-ingestion-rate-control), but the principle applies to any shedding mechanism (a minimal sketch follows the list):
- Larger budget increases for high-priority classes. P1 recovers capacity faster than P3.
- Smaller budget decreases for high-priority classes. P1 loses less than P3 on each congestion tick.
- Shared congestion signal. All classes see the same "system is overloaded" decision; the asymmetry is in how they react.
- No inter-class coordination. The emergent re-allocation comes from the arithmetic, not from explicit scheduling.
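A minimal sketch of the mechanism, assuming an AIMD-style shedder (the class names, tick driver, and starting budgets are illustrative; the coefficients match the table below):

```python
# One AIMD rate budget per priority class, all reacting to the same
# shared congestion signal; the asymmetry lives entirely in the constants.
COEFFS = {
    "P1": (15.0, 0.80),  # (additive increase, multiplicative decrease)
    "P2": (10.0, 0.60),
    "P3": (5.0, 0.40),
}

budgets = {cls: 100.0 for cls in COEFFS}  # admitted events/sec, illustrative

def on_tick(congested: bool) -> None:
    """One shedding tick: every class sees the same signal and reacts alone."""
    for cls, (add, mult) in COEFFS.items():
        if congested:
            budgets[cls] *= mult  # higher priority loses less per tick
        else:
            budgets[cls] += add   # higher priority recovers faster
```

No class reads another's budget; the re-allocation emerges from the shared signal plus the asymmetric constants.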
Coefficient table (canonical Zalando instance)
From the 2024 Zalando communications-platform post (Source: sources/2024-04-22-zalando-enhancing-distributed-system-load-shedding-with-tcp-congestion-control-algorithm):
| Priority | Additive increase | Multiplicative decrease |
|---|---|---|
| P1 (critical: order confirmations) | +15 | × 0.80 (−20%) |
| P2 (normal) | +10 | × 0.60 (−40%) |
| P3 (bulk: commercial messages) | +5 | × 0.40 (−60%) |
Reading the rows:
- On a not-congested tick: P1 climbs 3× faster than P3.
- On a congested tick: P3 loses 60% of its rate while P1 loses 20%.
Over a load episode with alternating ticks, the gap widens steadily: P1 hovers near pre-episode throughput while P3 collapses to a small fraction of it. When the load episode clears, P3 recovers slowly while P1 is already running at near-max.
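A quick simulation of such an episode (strictly alternating congested/clear ticks is an assumption for illustration), reusing the coefficients above:

```python
rates = {"P1": 100.0, "P2": 100.0, "P3": 100.0}
coeffs = {"P1": (15.0, 0.80), "P2": (10.0, 0.60), "P3": (5.0, 0.40)}

for tick in range(10):           # even ticks congested, odd ticks clear
    congested = tick % 2 == 0
    for cls, (add, mult) in coeffs.items():
        rates[cls] = rates[cls] * mult if congested else rates[cls] + add

print({cls: round(r, 1) for cls, r in rates.items()})
# -> {'P1': 83.2, 'P2': 30.8, 'P3': 9.3}
# P1 is still near its starting rate; P3 has collapsed to under 10% of it.
```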
Why the arithmetic re-allocates capacity
The system has a shared physical capacity (downstream throughput). When P3's rate shrinks by 60%, the capacity it released doesn't sit idle — the additive-increase probes of the still-large higher-priority classes claim it on the next not-congested tick. Over a few tick cycles:
```
Initial (fair):              P1 = 100, P2 = 100, P3 = 100
After some congestion ticks: P1 ≈ 80,  P2 ≈ 40,  P3 ≈ 20
```

Total admitted rate has dropped to ≈140 (from ≈300); the split is now ~57/29/14 instead of 33/33/33.
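Under the simplifying assumption of strictly alternating congested/clear ticks (an illustration, not a claim from the source), each class converges to the fixed point where one additive increase exactly undoes one multiplicative decrease, with a the class's additive constant and m its decrease factor:

$$x^{*} = m\,x^{*} + a \quad\Longrightarrow\quad x^{*} = \frac{a}{1 - m}$$

Plugging in the table gives P1 = 15/0.20 = 75, P2 = 10/0.40 = 25, and P3 = 5/0.60 ≈ 8.3: a heavily priority-weighted split (roughly 69/23/8) regardless of where the classes started.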
The system has converted the fairness property of AIMD into a priority-weighted allocation, deliberately breaking fairness because the business doesn't want fairness here.
Prerequisites
- A defined priority taxonomy. Discrete classes (typically 3-5), not continuous priorities — the coefficient table becomes unwieldy beyond that. Per-event-type priority assignment maintained by the domain team that owns the event type.
- A shared congestion signal at the shedding point.
- A per-class rate/budget state variable the shedder can mutate independently.
- Per-class floors and ceilings (a clamp sketch follows this list). Otherwise:
- P3 can collapse to 0 and never recover (starvation).
- P1 can grow unboundedly during calm periods and starve P2/P3 once a load episode clears.
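A sketch of the clamp, reusing COEFFS from the sketch above (the bound values here are illustrative, not from the source):

```python
BOUNDS = {"P1": (10.0, 500.0), "P2": (5.0, 300.0), "P3": (1.0, 200.0)}  # (floor, ceiling)

def clamped_step(cls: str, rate: float, congested: bool) -> float:
    add, mult = COEFFS[cls]
    rate = rate * mult if congested else rate + add
    floor, ceiling = BOUNDS[cls]
    # The floor keeps P3 from starving to zero; the ceiling keeps P1's
    # calm-period growth from crowding out P2/P3 once an episode clears.
    return min(max(rate, floor), ceiling)
```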
Coefficient-table design guidance
- Ratio, not absolute magnitude. Doubling all additive constants is roughly equivalent to halving the tick interval. What matters is the relative magnitudes across classes.
- Ratios that don't collapse quickly. If P1's decrease is 0.95 and P3's is 0.10, the gap between them after one congestion tick is already an order of magnitude — probably too sharp. Zalando's 0.8 / 0.6 / 0.4 is a gentler factor-of-2-ish progression.
- Asymmetric between increase and decrease sides. The increase side is probing (small steps); the decrease side is reacting (large steps). Zalando's table has P1's decrease at ×0.80 against an increase of +15: at typical operating rates the 20% cut removes more per tick than the +15 probe adds back. Standard AIMD property.
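One way to make these guidelines executable is a startup guard over the coefficient table; this validator and its thresholds are assumptions, not something the source describes:

```python
def validate_table(coeffs: dict[str, tuple[float, float]]) -> None:
    ordered = [coeffs[cls] for cls in sorted(coeffs)]  # P1 first, P3 last
    adds = [add for add, _ in ordered]
    mults = [mult for _, mult in ordered]
    assert adds == sorted(adds, reverse=True), "higher priority must recover faster"
    assert mults == sorted(mults, reverse=True), "higher priority must shed less"
    # One congestion tick should not open more than a ~4x gap between the
    # top and bottom class (0.95 vs 0.10 fails; Zalando's 0.80 vs 0.40 is 2x).
    # The 4x threshold is an arbitrary illustration.
    assert mults[0] / mults[-1] <= 4.0, "decrease ratios collapse too quickly"

validate_table(COEFFS)  # passes for +15/+10/+5 and ×0.80/×0.60/×0.40
```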
Anti-patterns
- Strict priority scheduling. Always admit P1, then P2, then P3 until capacity is exhausted, never mixing. This starves lower classes completely and produces sharp cliffs at the capacity boundary. Priority-differentiated shedding is the soft alternative: classes share capacity, weighted by priority.
- Per-request priority override. "Tag this marketing push as P1 just this once." Defeats the priority system; priority should be per-event-type, not per-event.
- Priority inflation. If everything becomes P1, the table does nothing. The canonical defence is operational review of priority assignments and floor/ceiling enforcement per class.
Seen in
- Zalando — Enhancing Distributed System Load Shedding with TCP Congestion Control Algorithm (2024-04-22) — canonical wiki instance. Three priority classes spanning 1,000+ Nakadi event types, with the coefficient table above driving per-event-type AIMD throttles inside the Stream Consumer. Production result after ~6 months: "the processing time for order confirmation is relatively stable... commercial messages experience an increase in the processing time. This is acceptable as this is a low priority use case." — the coefficient table turned into observed, SLO-aligned latency behaviour.
Related
- concepts/per-priority-aimd-coefficients — the mechanism underlying this pattern.
- concepts/additive-increase-multiplicative-decrease-aimd — the base algorithm.
- concepts/load-shedding-at-ingestion — where this pattern typically lives.
- concepts/critical-business-operation — canonical business primitive naming the P1 class.
- concepts/service-level-objective — the SLO is the reason the priority class exists.
- patterns/aimd-ingestion-rate-control — canonical composition partner.
- patterns/shed-load-during-capacity-shortage — sibling pattern at a different layer.
- patterns/shed-low-priority-under-load — adjacent pattern (binary drop vs graded rate-reduction).
- systems/zalando-stream-consumer