PATTERN Cited by 1 source

Probabilistic rejection prioritization¶

Pattern¶

Prioritise one client identity over another by configuring per-client rejection ratios that the throttler applies as a dice roll on each check, independent of system-health metrics. A de-prioritised client is rejected with some probability even when the database is healthy; a favoured client has a low (or zero) rejection ratio and gets to check the metric most or all of the time.

Every client still respects the metric-based admission decision when it is consulted. The difference is how often each identity gets to consult it.

The mechanism¶

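import random

def check(client_id):
    # rejection_ratio_for(), metric_check() and REJECT_BY_RATIO are supplied by the throttler.
    ratio = rejection_ratio_for(client_id)   # per-client rejection ratio, 0.0 .. 1.0
    if random.random() < ratio:              # dice roll, independent of system health
        return REJECT_BY_RATIO
    return metric_check()                    # normal system-health gate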

"A client asks the throttler for permission. The throttler can choose to roll a die, and if the result is, say, 1 or 2, flat out reject the request, irrespective of the system metrics. We thus consider a ratio of the requests to be rejected."

"A rejected client will back off, sleep for a while, then try again. The database is therefore less busy, at the expense of pushing back potential client work. But if we can selectively choose to have a high rejection ratio to one client, while having a low (or zero) rejection ratio to a second client, then we've effectively prioritized the second over the first: the first client will spend more time backing off, even if the database metrics are healthy. During such time, the second client will have more opportunity to do its own work."

— Shlomi Noach, Source: sources/2026-04-21-planetscale-anatomy-of-a-throttler-part-3
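
The back-off behaviour described in the quote can be sketched as a small simulation in virtual time. This is a minimal illustration, not the throttler's actual code: `make_check`, `run_client`, the client names, and the one-tick back-off are all assumptions made for the example.

```python
import random

ADMIT = "admit"
REJECT_BY_RATIO = "reject_by_ratio"
REJECT_BY_METRIC = "reject_by_metric"


def make_check(rejection_ratios, metric_is_healthy, rng):
    """Build a check() function (hypothetical helper, for illustration only)."""
    def check(client_id):
        ratio = rejection_ratios.get(client_id, 0.0)
        if rng.random() < ratio:
            return REJECT_BY_RATIO      # dice roll rejected; metric never consulted
        return ADMIT if metric_is_healthy() else REJECT_BY_METRIC
    return check


def run_client(client_id, check, ticks, backoff_ticks=1):
    """Simulate a well-behaved client for `ticks` units of virtual time.

    An admitted check costs one tick of work; a rejection costs
    `backoff_ticks` of sleeping before the client asks again.
    """
    done = 0
    t = 0
    while t < ticks:
        if check(client_id) == ADMIT:
            done += 1
            t += 1
        else:
            t += backoff_ticks
    return done


rng = random.Random(42)
# The database is assumed healthy throughout (metric_is_healthy always True).
check = make_check({"low": 0.9, "high": 0.0}, lambda: True, rng)
low_done = run_client("low", check, ticks=1000)
high_done = run_client("high", check, ticks=1000)
print(f"low={low_done} high={high_done}")
```

Even though the metric is healthy for the whole run, the `low` client spends most of its time backing off, while the `high` client works on every tick.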

Why it's safer than exemption¶

The alternative is exemption: skip the throttler entirely for the favoured client. That produces starvation: the exempted client's unthrottled workload can pin the metric above threshold for the whole rest of the fleet.

Probabilistic rejection avoids this because:

"It's important to highlight that both clients still play by the rules: none is given permission to act if the database has unhealthy metrics. It's just that one sometimes doesn't even get the chance to check those metrics."

No client can damage the database with this pattern — every client still respects the metric-gated admission decision when it does check. The prioritisation is purely over access to the check, not over the check's outcome.
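
The contrast with exemption can be shown with two toy gates. This is a sketch under assumptions: both functions and their boolean return values are hypothetical simplifications, and the metric is forced to "unhealthy" for the whole run.

```python
import random


def check(ratio, metric_healthy, rng):
    """Probabilistic rejection: dice roll first, then the health gate."""
    if rng.random() < ratio:
        return False
    return metric_healthy


def check_with_exemption(exempt, metric_healthy):
    """Exemption: a favoured client bypasses the health gate entirely."""
    return True if exempt else metric_healthy


rng = random.Random(0)
unhealthy = False

# No rejection ratio, however low, admits work while the metric is unhealthy.
admitted_probabilistic = sum(
    check(ratio, unhealthy, rng)
    for ratio in (0.0, 0.1, 0.9)
    for _ in range(1000)
)

# An exempted client keeps being admitted regardless of the metric.
admitted_exempt = sum(check_with_exemption(True, unhealthy) for _ in range(1000))

print(admitted_probabilistic, admitted_exempt)
```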

Worked example¶

Two clients, replication-lag threshold 5 s:

Client	Rejection ratio
`low-priority-etl`	0.90
`online-ddl-critical`	0.10

Out of 100 throttler checks per client per second, on average:

low-priority-etl gets 10 metric checks. If replication lag is below threshold, 10 subtasks admitted.
online-ddl-critical gets 90 metric checks. If lag is below threshold, 90 subtasks admitted.

At steady state, with the metric pinned at the threshold so that the health gate admits roughly half of the checks that reach it, both counts halve (about 5 and 45). The favoured client still does ~9× the work of the deprioritised client, without either client ever violating the health gate.
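
The arithmetic of the worked example can be reproduced with a few lines. This is a sketch of expected values only: `expected_admissions` is a hypothetical helper, and `p_metric_ok=0.5` is an assumed stand-in for "metric pinned at threshold".

```python
def expected_admissions(checks, rejection_ratio, p_metric_ok):
    """Expected admitted subtasks: the checks that survive the dice roll,
    times the fraction of those that the health gate lets through."""
    return checks * (1.0 - rejection_ratio) * p_metric_ok


ratios = {"low-priority-etl": 0.90, "online-ddl-critical": 0.10}

# Healthy database: every check that reaches the metric is admitted.
healthy = {c: expected_admissions(100, r, p_metric_ok=1.0) for c, r in ratios.items()}
# Metric pinned at threshold: assume about half of the metric checks pass.
pinned = {c: expected_admissions(100, r, p_metric_ok=0.5) for c, r in ratios.items()}

for label, result in (("healthy", healthy), ("pinned", pinned)):
    etl = result["low-priority-etl"]
    ddl = result["online-ddl-critical"]
    print(f"{label}: etl={etl:.1f} ddl={ddl:.1f} ratio={ddl / etl:.1f}x")
```

The health gate scales both clients' throughput by the same factor, so the 9:1 split is set by the rejection ratios alone.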

When the ratio applies¶

The pattern deliberately short-circuits the metric check when the dice roll rejects. Alternative implementations stack the ratio on top of the metric check: they always consult the metric, then additionally reject a ratio of the green lights. The prioritisation effect is the same; what differs is the instrumentation, since the stacked form consults the metric on every check.
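
The two variants can be compared side by side. This is a minimal sketch: `CountingMetric`, `short_circuit`, and `stacked` are hypothetical names, and the metric is assumed healthy so that only the ordering of the two steps differs.

```python
import random


class CountingMetric:
    """Hypothetical health gate that counts how often it is consulted."""

    def __init__(self, healthy=True):
        self.healthy = healthy
        self.calls = 0

    def check(self):
        self.calls += 1
        return self.healthy


def short_circuit(ratio, metric, rng):
    if rng.random() < ratio:
        return False                 # rejected by ratio; metric not consulted
    return metric.check()


def stacked(ratio, metric, rng):
    if not metric.check():           # always consult the metric first
        return False
    return rng.random() >= ratio     # then reject a ratio of the green lights


n = 10_000
m_short, m_stacked = CountingMetric(), CountingMetric()
rng_a, rng_b = random.Random(1), random.Random(1)   # identical dice sequences

admitted_short = sum(short_circuit(0.9, m_short, rng_a) for _ in range(n))
admitted_stacked = sum(stacked(0.9, m_stacked, rng_b) for _ in range(n))

print(admitted_short, admitted_stacked, m_short.calls, m_stacked.calls)
```

With the same dice sequence both variants admit the same requests. The short-circuit form only touches the metric for requests that pass the dice roll, while the stacked form records a metric reading for every request.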

Failure modes¶

Ratios alone are not enough for rogue clients. A client that ignores the throttler response entirely doesn't care about the dice roll. The pattern addresses prioritisation among well-behaved clients, not enforcement.
Hot-row workloads. If the favoured client's workload is very expensive per subtask, it may still pin the metric. A lower rejection ratio for that client then just means it bounces off the metric gate more often.
Bursty arrivals. A client that arrives after a long absence faces the dice roll at its configured ratio regardless of historical fairness. The pattern is stateless by design.
Starvation is still possible between favoured clients. The pattern bounds how much low-priority clients can interfere; it does not prevent two favoured clients from contending with each other.

Compositions¶

+ patterns/deprioritize-all-except-target — the dual framing: set a high global rejection ratio + zero on the favoured identity. Operationally simpler when you have one favoured client and many others.
+ patterns/time-bounded-throttler-rule — attach a TTL to every rejection-ratio rule so incident-time prioritisations can't become permanent policy.
+ concepts/throttler-client-identity — the hierarchical identity scheme makes it practical to express ratios at different levels (this job, this subsystem, all online-DDL).
+ patterns/enforcement-throttler-proxy — in enforcement mode, the dice-roll + active-delay composes into a per-query-delay-ratio lever.
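
Several of these compositions can be sketched together as a rule table. Everything here is an assumption made for illustration: the `RatioRules` class, the colon-separated identity format, and the use of an empty prefix as the "all clients" rule are hypothetical and not taken from the source.

```python
import time
from dataclasses import dataclass


@dataclass
class Rule:
    ratio: float        # rejection ratio, 0.0 .. 1.0
    expires_at: float   # time after which the rule is ignored


class RatioRules:
    """Hypothetical rule table: hierarchical prefixes, every rule has a TTL."""

    def __init__(self, now=time.monotonic):
        self.now = now
        self.rules = {}

    def set(self, identity_prefix, ratio, ttl_seconds):
        self.rules[identity_prefix] = Rule(ratio, self.now() + ttl_seconds)

    def ratio_for(self, client_id):
        parts = client_id.split(":")
        # Try the most specific prefix first; "" (i == 0) matches every client.
        for i in range(len(parts), -1, -1):
            rule = self.rules.get(":".join(parts[:i]))
            if rule is not None and rule.expires_at > self.now():
                return rule.ratio
        return 0.0   # no live rule: the client is never rejected by ratio


clock = [0.0]                                       # fake clock for the example
rules = RatioRules(now=lambda: clock[0])
rules.set("", 0.9, ttl_seconds=7200)                # deprioritise everyone...
rules.set("online-ddl", 0.5, ttl_seconds=3600)      # ...less so for a subsystem
rules.set("online-ddl:job-42", 0.0, ttl_seconds=600)  # ...and not the target job

print(rules.ratio_for("online-ddl:job-42"), rules.ratio_for("etl:nightly"))
```

Because each rule carries its own expiry, the incident-time favouring of `online-ddl:job-42` lapses first, then the subsystem rule, then the global one, after which every client is back to a zero rejection ratio.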

Seen in¶

sources/2026-04-21-planetscale-anatomy-of-a-throttler-part-3: canonical wiki introduction. Shlomi Noach proposes dice-roll rejection explicitly as the safer alternative to exemption, walking through the worked scenario where ratio-based prioritisation lets both clients respect the health gate while still giving the favoured client more throughput.