PLANETSCALE 2026-04-21

PlanetScale: Anatomy of a Throttler, part 3

Summary

Shlomi Noach closes his three-part throttler-design series with the client-side axis: who is asking the throttler, why it matters, and how to differentiate between them. Parts 1 and 2 treated all clients as interchangeable; part 3 names them. It argues that clients should identify themselves to the throttler for both observability and operational control; that the natural control lever is probabilistic rejection (dice-roll rejection ratios per client identity) rather than outright exemption, which starves every other client; that prioritising one client is best expressed as de-prioritising everyone else; that per-client differential metrics are exemption in disguise and carry the same starvation risk; and that any rule which changes throttler behaviour should come with a time bound. It also notes that the cooperative model has a structural hole: rogue or malfunctioning clients can simply skip the check. That hole is closed only by a barrier/proxy-shaped throttler such as Vitess's transaction throttler, which actively delays database queries rather than asking the client to back off. The post introduces Vitess's concrete hierarchical client-identity scheme, <uuid>:vcopier:vreplication:online-ddl, as the canonical wiki instance of multi-level identification supporting both specific and categorical rules.

Key takeaways

Clients should identify themselves to the throttler. Two load-bearing reasons: observability ("you want to know which operations were being throttled at what time. You want to be able to tell that the daily aggregation ETL job was mostly throttled between 07:00 and 07:25") and operational control ("Is it possible to prioritize one specific job over others? Or perhaps tune one down, or put it entirely on hold for a while? How about prioritizing a category of clients? Such questions can only be answered if we can clearly distinguish between clients.") (Source: sources/2026-04-21-planetscale-anatomy-of-a-throttler-part-3).
Pure exemption starves every other client. "A rogue client might neglect to connect to the throttler and just go ahead and send some massive workload. Or perhaps the throttler has a mechanism with which to exempt requests from a specific client. The end result is the same: all clients play nicely by the rules, but one gets a free pass to operate without limitation." Worked case: exempted client pushes replication lag to minutes, well above threshold; all other clients are continuously rejected by the throttler for the entire duration. "Exemption is risky because it not only blocks operation of other players but can also degrade system performance, going against the very reason for the throttler's existence."
Probabilistic rejection is the safe prioritisation lever. Instead of exempting, roll a die per check — with some configurable rejection ratio — and reject purely on the dice result, regardless of whether metrics are healthy. "A client asks the throttler for permission. The throttler can choose to roll a die, and if the result is, say, 1 or 2, flat out reject the request, irrespective of the system metrics. We thus consider a ratio of the requests to be rejected." Canonical wiki framing for the new patterns/probabilistic-rejection-prioritization pattern.
"Prioritise X" is best implemented as "de-prioritise everyone else." Two equivalent shapes. (a) High rejection ratio on the deprioritised client + low/zero ratio on the favoured client. (b) Low/zero ratio on the favoured client, high ratio globally otherwise. "In another model, one could configure the throttler to reject a ratio of requests for all clients, and then have a lower, or zero rejection ratio for a particular client. Thus, a safe way to prioritize one client over others is to de-prioritize all other clients." Canonical wiki framing for the new patterns/deprioritize-all-except-target pattern. Crucially, every client still plays by the rules: the database health check still gates all clients; what differs is how often each client even gets to the health check.
Per-client differential metrics are exemption in disguise. "Does it make sense for different clients to throttle based on different metrics? … Looking closely, this is an exemption scenario. While the second client throttles based on replication lag and also on load average, the first client is effectively exempted from checking load average. If that first client's workload is such that it does indeed push load average beyond its threshold, then the second client becomes starved." Mitigated in practice by operator judgement — "the engineer or administrator will be familiar with the type of impact of a specific job" — and by the argument that "if load average were to be high even without the first client, throttling that client may not have any impact at all, so we may as well just let it complete its job."
Exemption is sometimes justified. Three named cases where exemption is acceptable: (1) Short transient starvation — "if a client is starved for 10 minutes out of a total runtime of 12 hours, this may not be a big deal." (2) Incident fix that must run. "If a task absolutely has to run at all costs (e.g., fixing an incident) and that pushes resources beyond what we want to see in normal times, so be it." (3) Essential system components. "If the client is an essential part of the system itself, and goes through the throttling mechanism due to data flow design, and does not handle massive data changes, then we may and should exempt it altogether."
Hierarchical client identity enables specific + categorical rules. Vitess's canonical example: d666bbfc_169e_11ef_b0b3_0a43f95f28a3:vcopier:vreplication:online-ddl. Decomposition: UUID for this specific job / vcopier flow / vreplication subsystem / online-ddl schema migration. "With this identity scheme, it is possible to categorically prioritize (or de-prioritize) all online-ddl jobs, or just this very specific job, or alternatively exempt all vcopier flows entirely. Observability-wise, this makes it easier to analyze throttler access patterns by categories of requests." Canonical wiki instance of the new concepts/throttler-client-identity concept.
Any rule should have an expiry. "Jobs and operations eventually complete. But it's also a good idea to put a time limit on any rules you may have set. If you've exempted a category of clients, then it's best if that exemption expires at some point." Rush-hour suppression and investigation-window blocks are the canonical use cases. Canonical wiki framing for the new patterns/time-bounded-throttler-rule pattern. Failure mode this prevents: stale policy — a rule added during an incident is forgotten and keeps altering production behaviour long after the reason for it is gone.
Rogue clients are the structural hole of the cooperative model. "We've discussed the potential for rogue (or malfunctioning) clients to skip throttler checks. This is a possible scenario in the cooperative design." No amount of client-identity design or ratio-based prioritisation protects against a client that never asks.
Enforcement throttlers close the cooperative-model hole. "An alternative throttling enforcement design puts the throttler between the client and the system. The throttler runs as a proxy, or integrates with an existing proxy, to be able to throttle client requests. Such is the Vitess transaction throttler, which can actively delay database query execution when system performance degrades. Clients cannot bypass the throttler, and may not even be aware of its existence." Canonical wiki framing for the new patterns/enforcement-throttler-proxy pattern + new systems/vitess-transaction-throttler system.
Enforcement has a client-identification cost. "It's more complicated to identify the clients, and the throttler must rely on domain-specific attributes made available by the client/connection/query to be able to distinguish between clients and implement any needed prioritization." The proxy sees queries, not job-intents. The cooperative model gets identity for free; the enforcement model has to reconstruct it from connection attributes, query shape, SQL comments, or upstream signals.
Dynamic control is non-negotiable. "Dynamic control of the throttler is absolutely critical, and the ability to prioritize or push back specific requests or jobs is essential in production systems." Static config is a dead-end — every axis in this post (identity, rate, time bounds, rules) is an operator-adjustable knob at runtime.
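
The dice-roll lever and the "de-prioritise everyone else" shape described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Vitess code: the class name `CooperativeThrottler`, the `metrics_healthy` callback, and the `default_ratio` parameter are all hypothetical names introduced here.

```python
import random
from typing import Callable, Dict, Optional


class CooperativeThrottler:
    """Hypothetical check-style throttler with per-client rejection ratios."""

    def __init__(
        self,
        metrics_healthy: Callable[[], bool],
        default_ratio: float = 0.0,
        rng: Optional[random.Random] = None,
    ) -> None:
        self._metrics_healthy = metrics_healthy  # e.g. "replication lag is below threshold"
        self._default_ratio = default_ratio      # used for clients without their own ratio
        self._ratios: Dict[str, float] = {}
        self._rng = rng or random.Random()

    def set_ratio(self, client: str, ratio: float) -> None:
        """Adjust a client's rejection ratio at runtime."""
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("ratio must be within [0, 1]")
        self._ratios[client] = ratio

    def check(self, client: str) -> bool:
        """Return True if the client may proceed with its next batch of work."""
        ratio = self._ratios.get(client, self._default_ratio)
        # Dice roll first: reject this fraction of checks regardless of metrics.
        if self._rng.random() < ratio:
            return False
        # Requests that survive the roll are still gated by system health.
        return self._metrics_healthy()
```

Setting `default_ratio=2/6` and then calling `set_ratio("urgent-job", 0.0)` would express "prioritise urgent-job" as "de-prioritise everyone else": the favoured client reaches the health check on every call, while all other clients are turned away about a third of the time before metrics are even consulted.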

Systems / concepts / patterns extracted

Systems
- systems/vitess — tablet throttler covered in parts 1 and 2; this post adds the client-identity scheme as an extension and introduces the separate transaction-throttler subsystem.
- systems/vitess-throttler — the cooperative check-API throttler from prior parts, extended here with the hierarchical client-identity scheme.
- systems/vitess-transaction-throttler: new wiki system. Vitess's enforcement-mode throttler that sits in the query path inside VTTablet and actively delays query execution under degraded system performance. The post does not specify the exact integration layer. Canonical wiki instance of the barrier/proxy-shaped throttler design called out in part 1 but deferred until part 3.
- systems/mysql — the underlying database whose queries the transaction throttler delays.
- systems/planetscale — deployment context; consumes both the cooperative tablet throttler and the enforcement transaction throttler via Vitess.
Concepts
- concepts/throttler-client-identity — new wiki vocabulary. Hierarchical identifier scheme (<uuid>:<flow>:<subsystem>:<job-category>) that lets the throttler distinguish specific jobs from job categories, enabling both fine-grained (this one job) and categorical (all online-DDL jobs) rules.
- concepts/throttler-client-starvation — new wiki vocabulary. The failure mode where one client's unrestricted or low-rejection workload pushes metrics above threshold for long enough that every other client is continuously rejected. Caused by exemption, unidentified rogue clients, or differential metric assignments.
- concepts/throttler-exemption — new wiki vocabulary. The design choice to let specific clients bypass all or some throttler checks. Risky in general (starves others, undermines the throttler's existence) but justified in three named cases: short transient starvation, incident-fix tasks, essential system components.
Patterns
- patterns/probabilistic-rejection-prioritization — new canonical wiki pattern. Per-client rejection ratio applied metric-independently; dice roll on each check rejects a configurable fraction regardless of system health. De-prioritised clients spend more time backing off, favoured clients get more opportunities to check metrics — but every client still respects the metric-based decision when it is consulted.
- patterns/deprioritize-all-except-target — new canonical wiki pattern. Equivalent dual of "prioritise X": set a high rejection ratio globally + low/zero ratio on the favoured client identity. Operationally simpler because one favoured client identity is easier to list than enumerating every other client to deprioritise.
- patterns/time-bounded-throttler-rule — new canonical wiki pattern. Every throttler rule (exemption, prioritisation, de-prioritisation) carries a TTL after which it reverts to baseline. Prevents stale-policy accumulation from incident-response and rush-hour adjustments.
- patterns/enforcement-throttler-proxy: new canonical wiki pattern. Throttler sits in the query path and delays/rejects directly rather than asking the client to respect an answer. Closes the rogue-client hole of the cooperative model. Cost: client identification becomes an inference problem on connection/query attributes rather than a self-reported identity.
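
The time-bounded-rule pattern in the list above can be sketched as a rule store that refuses rules without a TTL and reverts to baseline once a rule expires. The names `TimedRule`, `RuleBook`, and `ratio_for` are hypothetical, and the injectable `clock` exists only to make the sketch easy to exercise; none of this reflects a real Vitess API.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class TimedRule:
    ratio: float       # rejection ratio while the rule is active
    expires_at: float  # clock value from which the rule no longer applies


class RuleBook:
    """Hypothetical rule store in which every rule must carry a TTL."""

    def __init__(
        self,
        baseline_ratio: float = 0.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._baseline = baseline_ratio
        self._clock = clock
        self._rules: Dict[str, TimedRule] = {}

    def set_rule(self, client: str, ratio: float, ttl_seconds: float) -> None:
        if ttl_seconds <= 0:
            raise ValueError("every rule needs a positive TTL")
        self._rules[client] = TimedRule(ratio, self._clock() + ttl_seconds)

    def ratio_for(self, client: str) -> float:
        rule = self._rules.get(client)
        if rule is None:
            return self._baseline
        if self._clock() >= rule.expires_at:
            # Expired: drop the rule so behaviour reverts to baseline.
            del self._rules[client]
            return self._baseline
        return rule.ratio
```

Making the TTL a required argument, rather than an optional one, is the design choice that guards against the stale-policy failure mode: a rule added during an incident cannot outlive its window by accident.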

Operational numbers

Rejection ratio in the worked example = "if the result is, say, 1 or 2, flat out reject", i.e. 2/6 ≈ 33% of checks rejected by the dice roll alone. The ratio is configurable, and no specific production-tuning guidance is given: the shape of the lever is named, while the specific values are deployment-dependent.
Starvation-tolerance heuristic = "if a client is starved for 10 minutes out of a total runtime of 12 hours, this may not be a big deal" — ~1.4% starvation share as the qualitative "probably fine" waterline. Not a rule.
Vitess client-identity scheme = 4-level hierarchy (UUID / flow / subsystem / job-category), e.g. d666bbfc_169e_11ef_b0b3_0a43f95f28a3:vcopier:vreplication:online-ddl.
Enforcement-throttler location in Vitess = VTTablet transaction-executor / query-delay layer, described abstractly in the post without code references.
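
Because the post describes the enforcement throttler only abstractly, the following is a generic sketch of the idea of delaying work in the query path, not a rendering of the Vitess transaction throttler. The class `EnforcementThrottler`, the `overloaded` callback, and the `max_delay_seconds` and `poll_seconds` parameters are all assumptions introduced for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class EnforcementThrottler:
    """Hypothetical in-path throttler: it holds queries back itself
    instead of asking clients to back off."""

    def __init__(
        self,
        overloaded: Callable[[], bool],
        max_delay_seconds: float = 1.0,
        poll_seconds: float = 0.05,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._overloaded = overloaded
        self._max_delay = max_delay_seconds
        self._poll = poll_seconds
        self._sleep = sleep

    def execute(self, run_query: Callable[[], T]) -> T:
        """Run the query, delaying it first while the system is overloaded."""
        waited = 0.0
        # Delay while the system is unhealthy, but stop once max_delay is reached.
        while self._overloaded() and waited < self._max_delay:
            self._sleep(self._poll)
            waited += self._poll
        return run_query()
```

The caller passes only the query to run and never consults the throttler itself, which is what makes the check impossible to skip. It also shows the identification cost: nothing in `execute` says which job the query belongs to.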

Caveats

No production tuning numbers. The post is prescriptive about the shape of the levers (rejection ratio, time-bounded rules, hierarchical identity, enforcement proxy) but deliberately doesn't disclose PlanetScale's production rejection ratios, time-window defaults, or rule-TTL policies.
No starvation-detection mechanism. Noach names starvation as a hazard but doesn't describe how an operator detects it in production (e.g. per-client rejection-rate alarms, long-tail runtime anomalies). Left to the reader.
Transaction-throttler internals undisclosed. The post names the system and its responsibility (active query delay) but doesn't walk its control loop, threshold logic, or integration with VTTablet's connection pool at code granularity. That is a future-ingest surface from Vitess docs / source.
Client-identification inference under enforcement is hand-waved. "Domain-specific attributes made available by the client/connection/query" — concrete mechanisms (SQL comments, session variables, connection tags, auth-scope attributes) are not enumerated.
Probabilistic-rejection pattern conflates two subtly different things. (a) A per-client rejection ratio applied before the metric check, so a losing dice roll rejects the request without consulting metrics, even when they are healthy. (b) A per-client rejection ratio applied on top of a passing metric check, rejecting a share of requests that metrics alone would have approved. Post describes (a); production implementations often do some blend.
No cross-reference to workload-class resource budgets. The post's probabilistic/de-prioritisation framing and the wiki's existing patterns/workload-class-resource-budget (from the 2026-04-11 PlanetScale-Postgres Traffic Control post) are at different substrate tiers (MySQL/Vitess vs Postgres/PS Traffic Control) but address the same mixed-workload concern. Post doesn't connect them; the connection is made explicit on the relevant wiki pages.
Tier-3 source, scope-disposition on-scope per PlanetScale skip-rules: Vitess-internals content by a Vitess core maintainer is default-include, and the post's architectural density is ~100% of the body (every paragraph advances a client-identity, prioritisation, exemption, or enforcement-shape primitive). Not a product announcement, not marketing copy.

Source

Original: https://planetscale.com/blog/anatomy-of-a-throttler-part-3
Raw markdown: raw/planetscale/2026-04-21-anatomy-of-a-throttler-part-3-ed911976.md