PATTERN Cited by 1 source

Aggregating buffered logger

Pattern. A library (or service) that emits very high-frequency events keeps an in-process aggregating counter map keyed on the event's identifying tuple, and a background thread periodically flushes the accumulated counts through the standard ingestion pipeline. Per-event overhead collapses from a network/serialisation call to a concurrent-map counter increment; data volume at ingestion collapses from O(events) to O(unique tuples) per flush interval.

Canonical wiki instance: Meta's 2024-12-02 cryptographic monitoring post, where FBCrypto implements this pattern to emit telemetry precise enough to support claims like "roughly 0.05 % of Meta CPU is X25519", without per-operation logging costs and without sampling.
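
The hot path and the flush interface amount to very little code. A minimal sketch, using a mutex-guarded std::map as a portable stand-in for folly::ConcurrentHashMap; the Tuple fields follow the cryptographic-monitoring example, and drain() is the read-and-clear step the background flush thread would call:

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <tuple>
#include <utility>

struct Tuple {  // event-identifying aggregation key (illustrative fields)
    std::string key_name, method, algorithm;
    bool operator<(const Tuple& o) const {
        return std::tie(key_name, method, algorithm)
             < std::tie(o.key_name, o.method, o.algorithm);
    }
};

class AggregatingLogger {
public:
    // Hot path: one short critical section + one counter increment, no I/O.
    void record(const Tuple& t) {
        std::lock_guard<std::mutex> g(mu_);
        ++counts_[t];
    }

    // Flush path: hand back the accumulated counts and reset the map;
    // the caller serialises one ingestion event per unique tuple.
    std::map<Tuple, uint64_t> drain() {
        std::lock_guard<std::mutex> g(mu_);
        return std::exchange(counts_, {});
    }

private:
    std::mutex mu_;
    std::map<Tuple, uint64_t> counts_;
};
```

The compression is visible in drain(): however many times record() ran, the flush emits one row per distinct tuple.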

Shape

client thread ─┐
client thread ─┼─► [ folly::ConcurrentHashMap<Tuple, Count> ]
client thread ─┘   (increment on every event)
                            │
                            │ read + clear on interval
                            ▼
                    [ background flush thread ]
                            │
                            ▼
            [ ingestion backend (Scribe, Kafka, etc.) ]
                    ┌───────┴───────┐
                    ▼               ▼
              warm store       cold store
              (Scuba)          (Hive)

Load-bearing components

  1. Write-heavy multithreaded map. Must tolerate concurrent increments from every request-path thread without becoming the bottleneck. Meta uses folly::ConcurrentHashMap, "built to be performant under heavy writes in multithreaded environments, while still guaranteeing atomic accesses."
  2. Aggregation key. Event-identifying tuple — at Meta for cryptographic monitoring this is (key name, method, algorithm, plus "more fields than just key name, method, and algorithm"). Cardinality must stay bounded relative to raw event rate; see concepts/derived-key-aggregation for the canonical cardinality-control discipline.
  3. Background flush thread. One per process; reads the map, serialises entries as ingestion events including the count, clears the map, sleeps until the next interval.
  4. First-flush jitter. Per-host random delay before the first flush — see patterns/jittered-flush-for-write-smoothing. Mandatory to prevent cohort-synchronised write spikes when many hosts start together.
  5. Shutdown-flush. A final synchronous flush on process exit drains remaining counts. Requires a singleton primitive (Meta: systems/folly-singleton) that preserves ingestion-framework availability during the shutdown sequence.

When to apply

  • Event rate is high relative to per-event logging cost — tens of thousands of events per second per process, or higher.
  • Aggregation-key cardinality is low relative to event rate — per-flush row count much smaller than per-event count.
  • Full-population fidelity is required — sampling is unacceptable because rare call-sites matter (inventory, migration scoping, per-key overuse alarms).
  • Per-event timing / context is not required at the analysis layer — counts + occasional aggregates suffice.
  • Event loss on crash is tolerable — trend-level accuracy over minutes, not per-event durability, is the requirement.

When not to apply

  • Per-event context (timing, payload, tags beyond the aggregation tuple) is required for analysis — use sampling-based trace telemetry instead.
  • Aggregation-key cardinality is high — no meaningful compression; the pattern degenerates to per-event logging with added buffer complexity.
  • Per-event durability is required (billing events, audit logs) — use synchronous write-ahead logging or transactional outbox patterns instead.

Variants

  • Sum + count + min/max/p99 per bucket instead of only count, when distribution metadata is needed. Cost shifts from O(1) per event to O(log N) for approximate quantile structures.
  • Per-thread local maps merged at flush time, when concurrent-map contention on the hot path is itself the bottleneck. Standard next optimisation when the shared ConcurrentHashMap saturates.
  • Time-windowed buckets in the map's value to preserve intra-flush-interval temporal resolution — trade more key-space for sub-flush-interval visibility.

Seen in
