PATTERN
Aggregating buffered logger¶
Pattern. A library (or service) that emits very high-frequency
events keeps an in-process aggregating counter map keyed on the
event's identifying tuple, and a background thread periodically
flushes the accumulated counts through the standard ingestion
pipeline. Per-event overhead collapses from a network/serialisation
call to a concurrent-map counter increment; data volume at
ingestion collapses from O(events) to O(unique tuples per flush
interval).
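The hot-path contract is small enough to sketch directly. This is a dependency-free illustration: a mutex-guarded std::map stands in for folly::ConcurrentHashMap, and the EventKey and AggregatingLogger names are hypothetical, not from the source.

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <tuple>
#include <utility>

// Illustrative event-identifying tuple: (key name, method, algorithm).
using EventKey = std::tuple<std::string, std::string, std::string>;

class AggregatingLogger {
 public:
  // Called on every event from any request-path thread.
  // The per-event cost is a map increment, not a network/serialisation call.
  void record(const EventKey& key) {
    std::lock_guard<std::mutex> g(mu_);
    ++counts_[key];
  }

  // Called by the background flush thread: read + clear in one step,
  // so no increments are lost between the read and the clear.
  std::map<EventKey, std::uint64_t> drain() {
    std::lock_guard<std::mutex> g(mu_);
    return std::exchange(counts_, {});
  }

 private:
  std::mutex mu_;
  std::map<EventKey, std::uint64_t> counts_;
};
```

A million identical events become one row with count = 1,000,000 at the next flush, which is the O(events) to O(unique tuples) collapse described above.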
Canonical wiki instance: Meta's 2024-12-02 cryptographic-monitoring post, where FBCrypto implements this pattern to emit cryptographic telemetry precise enough to say "roughly 0.05 % of Meta CPU is X25519", without per-operation logging costs and without sampling.
Shape¶
client thread ─┐
client thread ─┼─► [ folly::ConcurrentHashMap<Tuple, Count> ]
client thread ─┘ ▲
│ increment on every event
│
│ read + clear on interval
▼
[ background flush thread ]
│
▼
[ ingestion backend (Scribe, Kafka, etc.) ]
│
┌───────┴───────┐
▼ ▼
warm store cold store
(Scuba) (Hive)
Load-bearing components¶
- Write-heavy multithreaded map. Must tolerate concurrent increments from every request-path thread without becoming the bottleneck. Meta uses folly::ConcurrentHashMap — "built to be performant under heavy writes in multithreaded environments, while still guaranteeing atomic accesses."
- Aggregation key. Event-identifying tuple — at Meta, for cryptographic monitoring, this is (key name, method, algorithm, plus "more fields than just key name, method, and algorithm"). Cardinality must stay bounded relative to raw event rate; see concepts/derived-key-aggregation for the canonical cardinality-control discipline.
- Background flush thread. One per process; reads the map, serialises entries as ingestion events including the count, clears the map, sleeps until the next interval.
- First-flush jitter. Per-host random delay before the first flush — see patterns/jittered-flush-for-write-smoothing. Mandatory to prevent cohort-synchronised write spikes when many hosts start together.
- Shutdown-flush. A final synchronous flush on process exit drains remaining counts. Requires a singleton primitive (Meta: systems/folly-singleton) that preserves ingestion-framework availability during the shutdown sequence.
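The last three components — background flush thread, first-flush jitter, and shutdown-flush — can be sketched together. BufferedLogger, Sink, and the interval values are illustrative stand-ins, not the FBCrypto API: a real implementation hands entries to the ingestion client (Scribe, Kafka, ...) and hangs shutdown ordering off a singleton primitive rather than a plain destructor.

```cpp
#include <algorithm>
#include <atomic>
#include <chrono>
#include <cstdint>
#include <functional>
#include <map>
#include <mutex>
#include <random>
#include <string>
#include <thread>
#include <utility>

class BufferedLogger {
 public:
  // Stand-in for the real ingestion client.
  using Sink = std::function<void(const std::string& key, std::uint64_t count)>;

  BufferedLogger(Sink sink, std::chrono::milliseconds interval)
      : sink_(std::move(sink)), interval_(interval) {
    flusher_ = std::thread([this] { run(); });
  }

  ~BufferedLogger() {  // shutdown-flush: drain whatever remains on exit
    stop_.store(true);
    flusher_.join();
    flush();           // final synchronous flush
  }

  void record(const std::string& key) {
    std::lock_guard<std::mutex> g(mu_);
    ++counts_[key];
  }

 private:
  void run() {
    // First-flush jitter: a per-process random delay so a cohort of hosts
    // started together does not hit the backend in lockstep.
    std::mt19937 rng{std::random_device{}()};
    auto jitter = std::chrono::milliseconds(std::uniform_int_distribution<int>(
        0, static_cast<int>(interval_.count()))(rng));
    sleep_interruptibly(jitter);
    while (!stop_.load()) {
      flush();
      sleep_interruptibly(interval_);
    }
  }

  void flush() {
    std::map<std::string, std::uint64_t> snapshot;
    {
      std::lock_guard<std::mutex> g(mu_);
      snapshot.swap(counts_);  // read + clear in one critical section
    }
    for (const auto& [key, count] : snapshot) sink_(key, count);
  }

  // Sleep in small slices so shutdown is not delayed by a full interval.
  void sleep_interruptibly(std::chrono::milliseconds d) {
    auto step = std::chrono::milliseconds(10);
    for (auto left = d; left.count() > 0 && !stop_.load(); left -= step)
      std::this_thread::sleep_for(std::min(step, left));
  }

  Sink sink_;
  std::chrono::milliseconds interval_;
  std::mutex mu_;
  std::map<std::string, std::uint64_t> counts_;
  std::atomic<bool> stop_{false};
  std::thread flusher_;
};
```

Note the drain is a swap under the lock: the flush thread never serialises while holding the hot-path mutex, so request threads only ever contend on the increment itself.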
When to apply¶
- Event rate is high relative to per-event logging cost — tens of thousands of events per second per process, or higher.
- Aggregation-key cardinality is low relative to event rate — per-flush row count much smaller than per-event count.
- Full-population fidelity is required — sampling is unacceptable because rare call-sites matter (inventory, migration scoping, per-key overuse alarms).
- Per-event timing / context is not required at the analysis layer — counts + occasional aggregates suffice.
- Event loss on crash is tolerable — trend-level accuracy over minutes, not per-event durability, is the requirement.
When not to apply¶
- Per-event context (timing, payload, tags beyond the aggregation tuple) is required for analysis — use sampling-based trace telemetry instead.
- Aggregation-key cardinality is high — no meaningful compression; the pattern degenerates to per-event logging with added buffer complexity.
- Per-event durability is required (billing events, audit logs) — use synchronous write-ahead logging or transactional outbox patterns instead.
Variants¶
- Sum + count + min/max/p99 per bucket instead of only count, when distribution metadata is needed. Cost shifts from O(1) per event to O(log N) for approximate quantile structures.
- Per-thread local maps merged at flush time, when concurrent-map contention on the hot path is itself the bottleneck. Standard next optimisation when the shared ConcurrentHashMap saturates.
- Time-windowed buckets in the map's value to preserve intra-flush-interval temporal resolution — trade more key-space for sub-flush-interval visibility.
Seen in¶
- sources/2024-12-02-meta-built-large-scale-cryptographic-monitoring — canonical wiki instance. FBCrypto's aggregating buffered logger on folly::ConcurrentHashMap + periodic Scribe flush + per-host first-flush jitter + synchronous shutdown-flush. The whole shape above is disclosed in one post.
Related¶
- concepts/telemetry-buffer-and-flush — the underlying technique.
- concepts/cryptographic-monitoring — the canonical downstream consumer of this pattern at Meta.
- patterns/jittered-flush-for-write-smoothing — the paired discipline that prevents cohort-synchronised flush spikes.
- patterns/unified-library-for-fleet-telemetry — the strategic posture that makes one-library-side instrumentation sufficient to cover the whole fleet.
- systems/fbcrypto, systems/folly-concurrenthashmap, systems/folly-singleton, systems/scribe-meta — Meta's concrete implementation stack.