
Event-log-based counter

Definition

An event-log-based counter stores each counter mutation as an individual event in an event store, and computes the counter's value by aggregating events. This contrasts with an in-place counter that stores only the current value and updates it via a compare-and-swap (CAS) or locked increment.

The event log preserves the individual provenance of every increment and decrement, enabling audit, recounting, and reset, but it requires a background aggregation pipeline and a bucketed partitioning schema to keep reads fast and to keep the event log from becoming a hot partition. Canonical wiki instance: Netflix's Eventually-Consistent counter mode. (Source: sources/2024-11-13-netflix-netflixs-distributed-counter-abstraction)
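The contrast can be sketched in a few lines of Python. This is illustrative only: the class and field names are made up, the in-place `add` stands in for a CAS or locked increment against a datastore, and the event-log `add` stands in for an append to an event store.

```python
import time
import uuid

class InPlaceCounter:
    """Stores only the current value; provenance of each increment is lost."""
    def __init__(self):
        self.value = 0

    def add(self, delta):
        self.value += delta  # stands in for a CAS/locked increment

class EventLogCounter:
    """Stores every mutation as an immutable event; the value is derived
    by aggregating events, so audit, recount, and reset stay possible."""
    def __init__(self):
        self.events = []

    def add(self, delta):
        self.events.append({
            "event_time": time.time_ns(),  # event-time for windowing/ordering
            "event_id": uuid.uuid4().hex,  # unique id for idempotency
            "delta": delta,
        })

    def get_count(self):
        # Naive aggregate-on-read; the sections below explain why real
        # systems pre-aggregate instead of scanning on every read.
        return sum(e["delta"] for e in self.events)
```

The event-log version trades a cheap point read for an append-only write path and retained history; everything that follows is about making its reads fast again.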

Trade-offs vs. in-place counters

| Property                     | In-place counter               | Event-log counter                     |
|------------------------------|--------------------------------|---------------------------------------|
| Write contention on hot keys | High (CAS/lock)                | Low (append-only)                     |
| Idempotency                  | Requires external mechanism    | Natural via event composite key       |
| Audit / recounting           | Infeasible (only current value)| Feasible (events retained)            |
| Read latency                 | O(1) point-read                | Requires pre-aggregation              |
| Reset semantics              | Simple overwrite               | Requires ordering (LWW + event_time)  |
| Storage cost                 | Low per counter                | Higher; governed by retention         |

Why not just aggregate on read

In its simplest form, the event log is scanned on every GetCount. Netflix's post lists the drawbacks of this aggregate-on-read approach:

  • Read latency scales with event count for a given counter.
  • Duplicate work across concurrent readers each computing the same aggregate.
  • Wide partitions in Cassandra for hot counters (see concepts/wide-partition-problem).
  • Large data footprint without aggressive retention.

The fix is a background sliding-window rollup that continuously aggregates events into a checkpoint; reads serve the checkpoint plus, optionally, a small real-time delta (see patterns/sliding-window-rollup-aggregation).
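A minimal sketch of the checkpoint-plus-delta split, assuming events carry `event_time` and `delta` fields as above. The function names and checkpoint shape are illustrative, not Netflix's implementation:

```python
def rollup(events, checkpoint, window_end):
    """Background job: fold all events newer than the checkpoint, up to
    window_end, into a new checkpoint. Runs continuously so the
    unaggregated tail of the log stays small."""
    folded = sum(
        e["delta"]
        for e in events
        if checkpoint["time"] < e["event_time"] <= window_end
    )
    return {"value": checkpoint["value"] + folded, "time": window_end}

def get_count(checkpoint, events):
    """Read path: serve the pre-aggregated checkpoint plus a small
    real-time delta of events newer than it, instead of scanning the
    full event log on every read."""
    delta = sum(
        e["delta"] for e in events if e["event_time"] > checkpoint["time"]
    )
    return checkpoint["value"] + delta
```

Reads now touch only the events appended since the last rollup, so latency no longer scales with the counter's total event count.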

Schema requirements

For the event log to be a workable counter backing store:

  • Natural idempotency key via (event_time, event_id, event_item_key), so retries and hedged requests don't double-count.
  • Bucketed event-time partitioning (time-bucket + event-bucket columns) to prevent wide partitions under high throughput.
  • Descending event-time order so the most recent ClearCount is seen first when scanning the aggregation window.
  • Retention — the event log isn't the long-term store of record; aggregated counts are moved to a cheaper store for audits, and events age out via TTL.
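The first two requirements can be sketched as key-derivation functions. The bucket widths and counts here (`BUCKET_WIDTH_MS`, `EVENT_BUCKETS`) are illustrative assumptions, not values from the source; the point is the shape of the keys:

```python
BUCKET_WIDTH_MS = 60_000  # width of each time bucket (assumed value)
EVENT_BUCKETS = 8         # event buckets per time bucket (assumed value)

def partition_key(counter_name, event_time_ms, event_id):
    """Bucketed event-time partitioning: events for one counter are
    spread across (time_bucket, event_bucket) partitions, so a hot
    counter never accumulates into a single wide partition."""
    time_bucket = event_time_ms // BUCKET_WIDTH_MS
    event_bucket = hash(event_id) % EVENT_BUCKETS
    return (counter_name, time_bucket, event_bucket)

def idempotency_key(event_time_ms, event_id, event_item_key):
    """Composite row key: a retried or hedged write that re-sends the
    same event maps to the same row, so it cannot double-count."""
    return (event_time_ms, event_id, event_item_key)
```

A retry simply overwrites the same row instead of inserting a second increment, which is what makes the idempotency "natural" rather than requiring an external dedup mechanism.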

Enables audit + recounting

Two properties pre-aggregation loses:

  1. Auditing — extracting events to an offline system to verify every increment was applied correctly to reach the final value.
  2. Recounting — if increments need adjustment within a time window, the aggregate has already collapsed the individual events; the log preserves them.

Netflix names both as requirements their event-log-based counter must preserve, and notes this is why they rejected the durable-queue-with-stream-processor-aggregation approach: it pre-aggregates and loses auditing and recounting.
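Both properties reduce to replaying retained events, which a sketch makes concrete (function names and the `correction` callback are hypothetical, assuming the event shape used earlier):

```python
def audit(events, reported_value):
    """Replay every retained event and check that the sum reproduces
    the value the counter served; impossible once events have been
    collapsed into an aggregate."""
    return sum(e["delta"] for e in events) == reported_value

def recount(events, window_start, window_end, correction):
    """Recompute the counter with events inside a time window adjusted
    by `correction`; possible only because individual events were kept."""
    total = 0
    for e in events:
        delta = e["delta"]
        if window_start <= e["event_time"] <= window_end:
            delta = correction(delta)
        total += delta
    return total
```

A stream processor that emits only per-window sums cannot support either function, since the individual deltas it consumed are gone.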

Seen in

  • sources/2024-11-13-netflix-netflixs-distributed-counter-abstraction — Netflix's Eventually-Consistent counter is canonical. Events land in TimeSeries with composite-key idempotency; bucketed partitioning prevents wide partitions; background rollup into a Cassandra-backed Rollup Store + EVCache checkpoint; TTL retention on the event log.