CONCEPT Cited by 2 sources

DDSketch error-bounded percentile

A DDSketch (arXiv:1908.10693) is a streaming quantile sketch that reports percentiles with a guaranteed relative-error bound over a stream of numeric values, without retaining the raw values. Reported p50 / p95 / p99 are within a bounded relative error of the true quantiles, with a memory footprint logarithmic in the value range.

(Source: sources/2026-04-21-planetscale-query-performance-analysis-with-insights.)

Why sketches for percentiles

Observability systems need per-pattern / per-endpoint / per-tenant percentiles, potentially across millions of series. Two naive approaches fail at scale:

  • Retain every value and sort at query time — cost is O(N) memory per series, infeasible when N reaches billions.
  • Sample (keep 1 in K) — biases tail quantiles exactly where precision matters most (concepts/tail-latency-at-scale).

Quantile sketches give a third option: bounded memory, bounded relative error, mergeable across shards / workers / time buckets.
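To make "bounded memory" concrete, a back-of-envelope bucket count (the α = 0.01 error bound and the 1 µs–100 s latency range — eight orders of magnitude — are illustrative assumptions, not figures from the source):

```python
import math

# Assumed parameters: relative-error bound alpha = 0.01 and a value
# range spanning 8 orders of magnitude (e.g. 1 microsecond to 100 s).
alpha = 0.01
gamma = (1 + alpha) / (1 - alpha)            # bucket growth factor
buckets = math.ceil(math.log(1e8) / math.log(gamma))
print(buckets)                               # roughly 900 buckets
```

Around nine hundred small counters per series, versus the billions of raw values they summarize.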

DDSketch mechanism (sketch)

DDSketch buckets values on a logarithmic scale — bucket i covers [γ^i, γ^(i+1)) where γ = (1 + α)/(1 − α) and α is the relative-error bound. Inserting a value increments the corresponding bucket count. A quantile query scans buckets in order to find the one containing the target rank; the returned value is within α relative error of the true quantile by construction.
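The mechanism above can be sketched in a few lines (a minimal illustration of the log-bucketing and rank scan under the [γ^i, γ^(i+1)) indexing convention; the paper's full variant adds bucket collapsing and zero/negative handling, omitted here):

```python
import math

class DDSketch:
    """Minimal DDSketch over positive values (illustrative only)."""

    def __init__(self, alpha=0.01):
        self.alpha = alpha
        self.gamma = (1 + alpha) / (1 - alpha)   # bucket growth factor
        self.buckets = {}                        # bucket index -> count
        self.count = 0

    def insert(self, value):
        # bucket i covers [gamma^i, gamma^(i+1))
        i = math.floor(math.log(value, self.gamma))
        self.buckets[i] = self.buckets.get(i, 0) + 1
        self.count += 1

    def quantile(self, q):
        # walk buckets in index order until the cumulative count
        # reaches the target rank
        rank = q * (self.count - 1)
        seen = 0
        for i in sorted(self.buckets):
            seen += self.buckets[i]
            if seen > rank:
                # bucket "midpoint" estimate: for any x in
                # [gamma^i, gamma^(i+1)), this is within alpha
                # relative error of x
                return 2 * self.gamma ** (i + 1) / (self.gamma + 1)
        raise ValueError("empty sketch")
```

Feeding in latencies 1..10000, `quantile(0.99)` comes back within about 1% of the true p99 — the same precision `quantile(0.5)` gets for the median.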

Properties:

  • Mergeable: bucket-counts sum, so sketches from different shards / intervals can be combined without loss of error bounds.
  • Compact: bucket count grows logarithmically in the value range.
  • Rank-preserving for tails: the relative-error bound means p99 / p99.9 are reported with the same precision as p50, which is the core property fixed-bucket histograms fail to deliver.
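The merge property reduces to summing bucket maps — each bucket still covers the same value range, so the error bound is unaffected (bucket indexes and counts below are made up for illustration):

```python
from collections import Counter

# Two shards' sketches, represented as bucket-index -> count maps.
shard_a = Counter({3: 10, 4: 2})
shard_b = Counter({3: 5, 5: 1})

# Merging is plain count addition; no raw values needed.
merged = shard_a + shard_b
# merged == Counter({3: 15, 4: 2, 5: 1})
```

This is what lets per-shard or per-interval sketches roll up into a single percentile at query time.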

Use in PlanetScale Insights

Rafer Hazen, 2023-04-20: "We also send along a sketch of query execution times that allows us to show error-bounded percentiles (e.g., median and p95)." The sketch is emitted per query-pattern fingerprint per interval, merged in the aggregation pipeline (Kafka → ClickHouse per the enhanced-tagging post), and queried at UI time.

This is the structural answer to "how do you give every query pattern a p95 without storing every execution?" on a platform that handles "thousands or even millions of queries per second".

Seen in

Last updated · 470 distilled / 1,213 read