Skip to content

CONCEPT Cited by 1 source

Best-Effort vs. Eventually Consistent counter

Definition

A distributed counter service can expose its counting use cases as one of two primary modes with explicit trade-offs:

  1. Best-Effort counter — optimised for throughput, latency, and cost; tolerates approximate counts, no cross-region replication on mutations, no consistency guarantees, and no native idempotency. Retries are unsafe — a network retransmit of incr(+1) may increment twice.
  2. Eventually Consistent counter — optimised for accuracy, durability, and idempotency. Accepts a small lag (seconds typically) between mutation and observed count in exchange for correctness under retries + hedges + cross-region replication.

Canonical wiki instance: Netflix's Distributed Counter Abstraction surfaces exactly these two modes at the namespace level + an experimental third (Accurate) mode that adds real-time delta computation on top of the Eventually Consistent checkpoint. (Source: sources/2024-11-13-netflix-netflixs-distributed-counter-abstraction)

Why the taxonomy is load-bearing

Counter use cases split cleanly by tolerance for approximation:

  • A/B experiments running concurrently for short durations with many counters: "approximate count is sufficient." Best-Effort.
  • Feature-impression tracking, analytics fact-tables, billing counters: must be accurate, durable, cross-region visible, idempotent. Eventually Consistent.

Netflix's framing makes the split explicit so users pick a mode rather than negotiating per-counter consistency knobs:

"When it comes to distributed counters, terms such as 'accurate' or 'precise' should be taken with a grain of salt. In this context, they refer to a count very close to accurate, presented with minimal delays."

Implementation shapes

Best-Effort (EVCache-only)

counterCacheKey = <namespace>:<counter_name>

// add
return delta > 0
    ? cache.incr(counterCacheKey, delta, TTL)
    : cache.decr(counterCacheKey, Math.abs(delta), TTL);

// get
cache.get(counterCacheKey);

// clear — all replicas
cache.delete(counterCacheKey, ReplicaPolicy.ALL);

EVCache's native incr/decr give extremely high throughput within a region; cross-region replication is per-key-set-only (not increment), and there is no native idempotency — retries are unsafe.

Eventually Consistent (event log + background rollup)

See patterns/sliding-window-rollup-aggregation and the Counter system page for the full shape. At the conceptual level: persist each mutation as an idempotent event in an event store; aggregate in the background within an immutable window; cache the rolled-up count for point-read serving; propagate cross-region via the event-store replication.

Probabilistic data structures are a different axis

HyperLogLog counts distinct elements (cardinality), not increment/decrement per key. Count-Min Sketch can adjust per-key values but doesn't natively support per-key resets + TTLs — those would require additional data structures on top. Netflix's Counter post explicitly names both + rejects them on the grounds that EVCache already meets Best-Effort requirements with less operational surface, and Eventually Consistent requirements go beyond what sketches can cleanly provide.

Seen in

Last updated · 319 distilled / 1,201 read