Event-log-based counter¶
Definition¶
An event-log-based counter stores each counter mutation as an individual event in an event store, and computes the counter's value by aggregating events. This contrasts with an in-place counter that stores only the current value and updates it via CAS or locked increment.
The event log preserves the individual provenance of every increment and decrement, enabling audit, recounting, and reset, but it requires a background aggregation pipeline and a bucketed partitioning schema to keep reads fast and to prevent the event log from becoming a hot partition. Canonical wiki instance: Netflix's Eventually-Consistent counter mode. (Source: sources/2024-11-13-netflix-netflixs-distributed-counter-abstraction)
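The contrast can be made concrete with a minimal sketch (names and structure are illustrative, not Netflix's actual API): the in-place counter mutates a single value, while the event-log counter appends immutable events and derives the value by aggregation.

```python
import itertools

class InPlaceCounter:
    """Stores only the current value; each increment mutates it in place."""
    def __init__(self):
        self.value = 0

    def add(self, delta):
        # In a real store this would be a CAS or locked increment,
        # which contends on hot keys.
        self.value += delta

    def get(self):
        return self.value

class EventLogCounter:
    """Stores each mutation as an immutable event; value = aggregate."""
    def __init__(self):
        self.events = []              # append-only log, no write contention
        self._ids = itertools.count()

    def add(self, delta, event_time):
        # Each event keeps its own provenance (time, id, delta).
        self.events.append({"event_time": event_time,
                            "event_id": next(self._ids),
                            "delta": delta})

    def get(self):
        # Naive aggregate-on-read; see the rollup discussion below.
        return sum(e["delta"] for e in self.events)
```

Both report the same count, but only the event-log counter retains the individual events needed for audit and recounting.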
Trade-offs vs. in-place counters¶
| Property | In-place counter | Event-log counter |
|---|---|---|
| Write contention on hot keys | High (CAS/lock) | Low (append-only) |
| Idempotency | Requires external mechanism | Natural via event composite key |
| Audit / recounting | Infeasible (only current value) | Feasible (events retained) |
| Read latency | O(1) point-read | Requires pre-aggregation |
| Reset semantics | Simple overwrite | Requires ordering (LWW + event_time) |
| Storage cost | Low per counter | Higher; governed by retention |
Why not just aggregate on read¶
In the simplest form, the event log is scanned and aggregated on every GetCount call.
Netflix's post lists the drawbacks of this approach:
- Read latency scales with event count for a given counter.
- Duplicate work across concurrent readers each computing the same aggregate.
- Wide partitions in Cassandra for hot counters (see concepts/wide-partition-problem).
- Large data footprint without aggressive retention.
The fix is a background sliding-window rollup that continuously aggregates events into a checkpoint; reads serve the checkpoint + optionally a small real-time delta (see patterns/sliding-window-rollup-aggregation).
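A minimal sketch of that read path, under assumed names (`rollup`, `get_count`, a `checkpoint` dict with `value` and `as_of` fields — none of these are Netflix's actual identifiers): the background job folds events older than the window boundary into the checkpoint, and reads serve the checkpoint plus the delta of events newer than it.

```python
def rollup(events, checkpoint, now, window):
    """Background step: fold events older than (now - window) into the
    checkpoint. Events inside the window stay raw so late arrivals and
    retries can still land without a recount."""
    boundary = now - window
    folded = sum(e["delta"] for e in events
                 if checkpoint["as_of"] < e["event_time"] <= boundary)
    return {"value": checkpoint["value"] + folded, "as_of": boundary}

def get_count(events, checkpoint):
    """Read path: checkpoint plus a small real-time delta, instead of
    scanning the whole log."""
    delta = sum(e["delta"] for e in events
                if e["event_time"] > checkpoint["as_of"])
    return checkpoint["value"] + delta
```

The read cost now scales with the events inside the window, not with the counter's lifetime event count.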
Schema requirements¶
For the event log to be a workable counter backing store:
- Natural idempotency key via `(event_time, event_id, event_item_key)` so retries + hedges don't double-count.
- Bucketed event-time partitioning (time-bucket + event-bucket columns) to prevent wide partitions under high throughput.
- Descending event-time order so the most recent `ClearCount` is seen first when scanning the aggregation window.
- Retention: the event log isn't the long-term store of record; aggregated counts are moved to a cheaper store for audits, and events age out via TTL.
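The first two requirements can be sketched together (bucket width, bucket count, and all names here are assumptions for illustration, not the TimeSeries schema): the composite key makes duplicate writes no-ops, and the `(time_bucket, event_bucket)` pair spreads a hot counter's writes across many partitions.

```python
import zlib

BUCKET_WIDTH_SECONDS = 60   # assumed time-bucket width
EVENT_BUCKETS = 16          # assumed event buckets per time bucket

def partition_for(event_time, event_id):
    """Bucketed event-time partitioning: a hot counter's writes land in
    many (time_bucket, event_bucket) partitions instead of one wide one."""
    time_bucket = int(event_time // BUCKET_WIDTH_SECONDS)
    # crc32 rather than hash() so the bucket is stable across processes.
    event_bucket = zlib.crc32(str(event_id).encode()) % EVENT_BUCKETS
    return (time_bucket, event_bucket)

class EventStore:
    """Idempotent append: the composite key (event_time, event_id,
    event_item_key) makes retried or hedged writes no-ops."""
    def __init__(self):
        self.rows = {}  # composite key -> delta

    def append(self, event_time, event_id, event_item_key, delta):
        key = (event_time, event_id, event_item_key)
        # A duplicate key is silently absorbed, so it can't double-count.
        self.rows.setdefault(key, delta)
```

A client retrying a timed-out write (or racing a hedged request) simply re-sends the same key; the second append changes nothing.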
Enables audit + recounting¶
Two properties pre-aggregation loses:
- Auditing — extracting events to an offline system to verify every increment was applied correctly to reach the final value.
- Recounting — if increments need adjustment within a time window, the aggregate has already collapsed the individual events; the log preserves them.
Netflix names both as requirements its event-log-based counter must preserve, and notes that this is why it rejected the durable-queue-with-stream-processor-aggregation approach: that design pre-aggregates and so loses both auditing and recounting.
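Both properties reduce to replaying the retained events; a minimal sketch, assuming events are dicts with `event_time` and `delta` fields (function names are illustrative):

```python
def audit(events, reported_value):
    """Offline audit: replay the log and check it reproduces the
    reported final value. Impossible once events are collapsed."""
    return sum(e["delta"] for e in events) == reported_value

def recount(events, adjust, start, end):
    """Recount: apply an adjustment to every event inside the time
    window [start, end] and re-aggregate from the preserved events."""
    return sum(adjust(e["delta"]) if start <= e["event_time"] <= end
               else e["delta"]
               for e in events)
```

With only a pre-aggregated value, neither check is possible: the per-event deltas that `audit` replays and `recount` adjusts no longer exist.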
Seen in¶
- sources/2024-11-13-netflix-netflixs-distributed-counter-abstraction — Netflix's Eventually-Consistent counter is canonical. Events land in TimeSeries with composite-key idempotency; bucketed partitioning prevents wide partitions; background rollup into a Cassandra-backed Rollup Store + EVCache checkpoint; TTL retention on the event log.