PATTERN
Zero injection for sparse counters¶
A transparent fix for Prometheus' silent undercounting of sparse
counters under rate() / increase(). On the first flush of each
counter series, the aggregation tier emits a synthetic zero sample,
timestamped just before the slot the real sample would occupy,
instead of the actual running total; the real first total follows on
the next flush. This seeds Prometheus with the zero baseline its
rate() implementation implicitly assumes.
The problem (why this is needed)¶
In Prometheus, a counter data point is a cumulative total from zero.
rate() derives change across consecutive samples of the same series.
Counter creation and the first increment happen in the same call, so
if the series is reset (pod restart, aggregator restart, scale-down)
before a second increment arrives, rate() only ever sees one sample
and has no delta to compute: the first increment is silently lost.
In StatsD, every flush is a delta in its own right, so the first increment is never lost.
Airbnb found this wasn't rare: their workloads generate many high-cardinality, low-rate counters — e.g. requests broken down by currency × user × region — where individual series increment only a handful of times per day. The edge case is the common case. It blocked migration progress until solved. (Source: sources/2026-04-16-airbnb-statsd-to-otel-metrics-pipeline)
The solution¶
Inside the aggregator (see systems/vmagent):
- On the first flush of a given output counter series, emit a zero (with a timestamp slightly before what would otherwise be the first real sample, to avoid collision).
- On all subsequent flushes, emit the real running total.
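The two rules above can be sketched as a small wrapper around the flush path. This is a hypothetical illustration, not vmagent's actual internals; the class name, the per-series `seen` set, and the 1 ms backdate are all assumptions made for the sketch:

```python
import time

class ZeroSeedingFlusher:
    """Illustrative first-flush logic: emit a backdated zero for a new
    counter series, then the real running total on every later flush."""

    def __init__(self, backdate_ms=1):
        self.seen = set()              # series keys already flushed once
        self.backdate_ms = backdate_ms # offset to avoid timestamp collision

    def flush(self, series_key, running_total, now_ms=None):
        """Return the (timestamp_ms, value) sample to emit this flush."""
        if now_ms is None:
            now_ms = int(time.time() * 1000)
        if series_key not in self.seen:
            self.seen.add(series_key)
            # Seed the zero baseline; the real total goes out next flush.
            return (now_ms - self.backdate_ms, 0)
        return (now_ms, running_total)

f = ZeroSeedingFlusher()
print(f.flush('payments_total{currency="EUR"}', 3, now_ms=10_000))  # (9999, 0)
print(f.flush('payments_total{currency="EUR"}', 3, now_ms=20_000))  # (20000, 3)
```

Note that the increment accumulated before the first flush is not dropped: it stays in the running total and is emitted one flush interval later, which is the lag listed under trade-offs.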
Now rate() always has a zero baseline to diff against, so the very
first real increment is captured correctly.
Trade-offs¶
- One-flush-interval lag before the first increment becomes visible. For counters that tick a handful of times per day, this is negligible.
- Small extra sample per counter series on creation.
- Fix lives in the aggregation tier, not per callsite → invisible to app teams, no PromQL hacks, no gauge-instead-of-counter lies.
Why this won over alternatives¶
| Rejected option | Why it failed |
|---|---|
| Pre-initialize all counters to zero at app startup | Can't enumerate label combinations ahead of time. |
| Tell teams to use logs for exact counts | Different system, different latency, breaks alerting UX. |
| Emit gauges instead of counters | Against Prometheus conventions; gauges/counters are the same internally but semantic expectations differ. |
| Pad queries with PromQL hacks | Pushes complexity onto every dashboard/alert owner. |
The centralized streaming-aggregation tier (concepts/streaming-aggregation) was the right single place to fix the semantic gap, illustrating the general principle: solve backend quirks in the pipeline, not in users' queries.
Seen in¶
- sources/2026-04-16-airbnb-statsd-to-otel-metrics-pipeline — Airbnb vmagent tweak: first flush of an aggregated count emits a delayed zero instead of the running total.