Skip to content

SYSTEM Cited by 1 source

Caffeine (Java in-process cache)

Definition

Caffeine (github.com/ben-manes/caffeine) is a high-performance Java 8+ in-process cache library by Ben Manes. It is widely used as the default JVM local-cache primitive — the canonical replacement for Guava's LoadingCache. Caffeine's distinguishing properties are the W-TinyLFU admission policy (better hit rates than LRU for most real workloads) and a rich async-loading API surface that makes stale- while-revalidate trivial at the application altitude.

Features that matter at scale

  • AsyncLoadingCache — values are loaded via a CompletableFuture, letting the cache serve many concurrent misses for the same key with a single downstream fetch (request collapsing).
  • refreshAfterWrite — configurable trailing stale window inside the TTL. A read in the window returns the cached value immediately and triggers a background refresh, so application latency is bounded even on near- expiry hits. PRAPI uses a 60-second TTL with a 15-second stale window — in the last 15 s, a get triggers a background DynamoDB refresh. (Source: sources/2025-03-06-zalando-from-event-driven-chaos-to-a-blazingly-fast-serving-api.)
  • W-TinyLFU admission — tracks frequency of recently- evicted items to decide whether a new item should displace an existing cache entry; significantly better than LRU on scan-resistant workloads.
  • Weighted eviction, size-bounded or time-bounded — individual entries can be weighted by e.g. byte size so the cache eviction respects memory budget rather than entry count.
  • Stats API — hit/miss/load-success counters wired in without requiring a wrapper.

Seen in

  • sources/2025-03-06-zalando-from-event-driven-chaos-to-a-blazingly-fast-serving-api — Zalando's PRAPI (Product Read API) serving tier uses Caffeine's async-loading cache as its local-cache layer on top of DynamoDB. Product payloads are cached as ByteArray rather than ObjectNode graphs to reduce heap pressure and eliminate GC pauses. The 60s / 15s-stale-window configuration is the load-bearing choice that makes PRAPI's sub-10ms P99 possible while still giving the hot-set consistent-hash routing time to refresh near-expiry entries in the background.
Last updated · 501 distilled / 1,218 read