

Token-bucket slow-query limiter

A token-bucket slow-query limiter is a rate limiter applied to the instrumentation path that records "slow query" events: bursts are captured, but long-tail throughput stays bounded. The bucket refills continuously; its capacity bounds the number of events admitted during a burst before rate limiting kicks in.

The Insights-specific twist: the bucket limits how many slow-query records VTGate emits to Kafka, not how many slow queries the database executes. It protects the observability pipeline from a storm of slow queries, not the database from its own queries.

(Source: sources/2026-04-21-planetscale-storing-time-series-data-in-sharded-mysql-to-power-query-insights.)
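A minimal sketch of the mechanism (an illustrative API, not VTGate's actual code): the bucket refills in proportion to elapsed time, is capped at its capacity, and admits an event only when a whole token is available. The injectable clock is just for testability.

```python
import time


class TokenBucket:
    """Continuously refilled token bucket: `capacity` bounds burst size,
    `refill_rate` (tokens/second) bounds steady-state throughput."""

    def __init__(self, capacity, refill_rate, now=time.monotonic):
        self.capacity = float(capacity)
        self.refill_rate = float(refill_rate)
        self.tokens = float(capacity)  # start full: "generous initial capacity"
        self.now = now
        self.last = now()

    def allow(self):
        # Refill proportionally to elapsed time, capped at capacity.
        t = self.now()
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.refill_rate)
        self.last = t
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Starting the bucket full is what makes the first burst after a quiet period fully observable; after that, `refill_rate` is the hard ceiling on admitted events per second.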

Why a token bucket specifically

Rafer Hazen, 2023-08-10: "We limit the number of recorded slow query log events using a continuously refilled token bucket rate limiter with a generous initial capacity. This allows us to capture bursts of slow queries but limit overall throughput. Typically you don't need to see hundreds of examples of the same slow query, so this doesn't detract from the product."

Three load-bearing properties:

  • Burst-capture: generous initial capacity absorbs the "first N examples" of a slow-query storm — the valuable diagnostic samples.
  • Long-tail suppression: steady refill rate bounds events-per-second regardless of underlying query volume. A runaway bad query emitting 100k slow-events/s doesn't saturate Kafka / MySQL.
  • Diminishing-returns framing: "Typically you don't need to see hundreds of examples of the same slow query." The user signal is in the first few samples; more samples are noise.
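These properties fall out of a simple bound: over a storm of duration T, admitted events ≤ capacity + refill_rate × T, no matter how many slow queries the database actually executes. A small simulation illustrates this (the capacity and refill values are made up for the example, not PlanetScale's):

```python
def simulate(events_per_sec, duration_s, capacity=100, refill_per_sec=5):
    """Count slow-query events a token bucket admits during a storm.
    `capacity` and `refill_per_sec` are illustrative values only."""
    tokens = float(capacity)
    admitted = 0
    dt = 1.0 / events_per_sec  # time between consecutive slow events
    for _ in range(int(events_per_sec * duration_s)):
        tokens = min(capacity, tokens + dt * refill_per_sec)
        if tokens >= 1.0:
            tokens -= 1.0
            admitted += 1
    return admitted
```

A runaway query producing 100k slow events/s for 10 s gets capped near capacity + rate × duration = 100 + 50 = 150 admitted records, while a quiet workload of one slow query per second is captured in full.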

Interaction with the per-pattern aggregate

Aggregate metrics — count, total time, DDSketch latency — are emitted per fingerprint per interval and are independent of the slow-query-event rate limiter. If the limiter drops slow-query events, the per-pattern aggregates still reflect the full traffic (count, total time, percentile sketch unchanged). The limiter only throttles the individual-query detail stream — which is what drives the Notable queries / slow-query-log UI row-by-row.
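The split can be sketched as follows. This is a hypothetical shape, not VTGate's implementation: `QueryInsights`, `emit_detail`, and the threshold are invented names; the point is only that the aggregate update is unconditional while the detail emission sits behind the limiter.

```python
from collections import defaultdict


class QueryInsights:
    """Sketch: per-fingerprint aggregates reflect full traffic; only the
    per-query detail stream passes through the token-bucket limiter."""

    def __init__(self, limiter, emit_detail):
        self.limiter = limiter          # object with .allow() -> bool
        self.emit_detail = emit_detail  # e.g. a Kafka producer send
        self.agg = defaultdict(lambda: {"count": 0, "total_ms": 0.0})

    def record(self, fingerprint, latency_ms, slow_threshold_ms=1000.0):
        a = self.agg[fingerprint]       # always updated, never rate limited
        a["count"] += 1
        a["total_ms"] += latency_ms
        # The detail event is the only thing the limiter can drop.
        if latency_ms >= slow_threshold_ms and self.limiter.allow():
            self.emit_detail(fingerprint, latency_ms)
```

So even when the limiter drops every detail event, the per-pattern count and total time (and, in the real system, the DDSketch) stay exact.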

Comparison to binary sampling (log 1-in-N)

A token bucket is better than uniform 1-in-N sampling for this workflow because:

  • It captures all slow queries during slow periods (when the bucket is full).
  • It bounds throughput during busy periods (when the bucket is draining).
  • It avoids the sampling-aliasing bias that uniform sampling introduces on bursty traffic.
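The first two bullets can be shown side by side (illustrative parameters; `refill_per_event` folds the refill rate into a per-event increment for simplicity):

```python
def sample_1_in_n(events, n):
    """Uniform sampling: keep every n-th event regardless of load."""
    return [e for i, e in enumerate(events) if i % n == 0]


def token_bucket_admit(events, capacity, refill_per_event):
    """Token bucket: admit while tokens remain; refill a little per event."""
    tokens = float(capacity)
    out = []
    for e in events:
        tokens = min(capacity, tokens + refill_per_event)
        if tokens >= 1.0:
            tokens -= 1.0
            out.append(e)
    return out
```

During a quiet period the bucket is full, so every slow query is kept while 1-in-N discards most of the few samples that exist; during a storm the bucket caps output near capacity plus total refill, while 1-in-N output still grows linearly with load.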
