
PATTERN Cited by 1 source

ML-predicted TTL

Pattern

Use a lightweight machine-learned model to predict the optimal time-to-live (TTL) for each cached item based on its observable features, rather than applying a fixed TTL or a purely heuristic rule. The model is trained on historical access traces to minimize total cost (memory rental + miss penalty).
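
To make the "memory rental + miss penalty" objective concrete, here is a minimal sketch of how the cost of one TTL could be scored against a historical trace. It is an illustration, not the source's implementation: `CostParams`, `TotalCost`, and `BreakEvenTtl` are hypothetical names, the units are arbitrary, and it assumes each access resets the item's TTL timer. `BreakEvenTtl` is the classic ski-rental break-even point (keep renting until the accumulated rent equals the one-off cost), applied here to memory rent versus re-fetch cost.

```cpp
#include <cassert>
#include <vector>

// Hypothetical cost parameters; names and units are illustrative assumptions.
struct CostParams {
    double rent_per_byte_sec;  // memory rental price per byte per second
    double miss_cost;          // penalty charged for each cache miss
};

// Total cost of caching one item of `size_bytes` under a fixed TTL, given the
// gaps (in seconds) between consecutive accesses to that item.
// Assumption: every access resets the TTL timer.
double TotalCost(const std::vector<double>& gaps_sec, double ttl_sec,
                 double size_bytes, const CostParams& p) {
    assert(ttl_sec >= 0.0);
    const double rent_per_sec = size_bytes * p.rent_per_byte_sec;
    double cost = 0.0;
    for (double gap : gaps_sec) {
        if (gap <= ttl_sec) {
            // Hit: the item stayed in memory for the whole gap.
            cost += gap * rent_per_sec;
        } else {
            // Miss: memory was rented until expiry, then the item is re-fetched.
            cost += ttl_sec * rent_per_sec + p.miss_cost;
        }
    }
    return cost;
}

// Classic ski-rental break-even: the holding time at which accumulated
// memory rent equals the cost of a single miss.
double BreakEvenTtl(double size_bytes, const CostParams& p) {
    assert(size_bytes > 0.0 && p.rent_per_byte_sec > 0.0);
    return p.miss_cost / (size_bytes * p.rent_per_byte_sec);
}
```

A training pipeline could, under these assumptions, label each historical item with the TTL that minimizes `TotalCost` and fit the model to those labels.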

Constraints

  • Latency: the prediction runs on the request path, so at billions of requests per second only nanosecond-scale overhead is acceptable.
  • Interpretability: operators need to understand what the model is doing, especially when it affects cache hit rates.
  • Feature availability: the model can only use features available at cache-insertion time (item size, operation type, miss cost), not future access patterns.

Solution shape

A shallow decision tree (few splits, translatable to a handful of if-else branches in C++) that considers:

  • Page/item size
  • Miss cost (latency or I/O cost to re-fetch)
  • Operation type (read, scan, point lookup)

The tree predicts the optimal TTL that minimizes expected cost under the ski-rental cost model.
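
The sketch below shows what a shallow tree compiled down to if-else branches might look like. Every split threshold and leaf TTL is invented for illustration; a real tree would come from training on access traces. `ItemFeatures`, `OpType`, and `PredictTtlSeconds` are hypothetical names, not identifiers from the source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature set, limited to what is known at cache-insertion time.
enum class OpType { kRead, kScan, kPointLookup };

struct ItemFeatures {
    std::uint32_t size_bytes;  // page/item size
    double miss_cost_us;       // cost to re-fetch, in microseconds
    OpType op;                 // operation that inserted the item
};

// A three-level tree written as plain branches. All thresholds and leaf
// values below are made up; they stand in for learned parameters.
double PredictTtlSeconds(const ItemFeatures& f) {
    assert(f.miss_cost_us >= 0.0);
    constexpr std::uint32_t kLargeItemBytes = 64 * 1024;

    if (f.op == OpType::kScan) {
        return 1.0;  // assumed: scanned pages are rarely re-read soon
    }
    if (f.miss_cost_us < 100.0) {
        // Cheap to re-fetch: hold briefly, even less so for large items.
        return f.size_bytes > kLargeItemBytes ? 5.0 : 30.0;
    }
    // Expensive to re-fetch: hold longer, especially if the item is small.
    return f.size_bytes > kLargeItemBytes ? 60.0 : 600.0;
}
```

Because the function is a few comparisons with no allocation or indirection, it can sit on the request path, and an operator can read the branches directly to see why an item received a given TTL.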

Trade-offs

  • Sacrifices marginal accuracy vs deeper models in exchange for near-zero-overhead inference
  • Requires periodic retraining if workload characteristics shift
  • On public traces without application-level features, a simpler per-page historical-best-TTL lookup is used as a fallback
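
One way the per-page fallback could work is sketched here, assuming the lookup simply stores, for each page, whichever candidate TTL was cheapest on that page's own history. The source does not describe the mechanism, so the candidate grid, the cost formula (rent while held, plus a penalty per miss, with accesses resetting the timer), and the names `GapCost`, `BestTtl`, `BuildTtlTable`, and `LookupTtl` are all assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <unordered_map>
#include <vector>

using PageId = std::uint64_t;
using GapHistory = std::unordered_map<PageId, std::vector<double>>;

// Cost of one page's access gaps under a TTL: rent while held, plus a
// penalty for every gap that outlives the TTL. Illustrative model only.
double GapCost(const std::vector<double>& gaps_sec, double ttl_sec,
               double rent_per_sec, double miss_cost) {
    double cost = 0.0;
    for (double gap : gaps_sec) {
        cost += (gap <= ttl_sec) ? gap * rent_per_sec
                                 : ttl_sec * rent_per_sec + miss_cost;
    }
    return cost;
}

// Pick the candidate TTL with the lowest historical cost (first one on ties).
double BestTtl(const std::vector<double>& gaps_sec,
               const std::vector<double>& candidates,
               double rent_per_sec, double miss_cost) {
    assert(!candidates.empty());
    double best_ttl = candidates.front();
    double best_cost = std::numeric_limits<double>::infinity();
    for (double ttl : candidates) {
        const double c = GapCost(gaps_sec, ttl, rent_per_sec, miss_cost);
        if (c < best_cost) {
            best_cost = c;
            best_ttl = ttl;
        }
    }
    return best_ttl;
}

// Build the per-page table once, offline, from historical gaps.
std::unordered_map<PageId, double> BuildTtlTable(
    const GapHistory& history, const std::vector<double>& candidates,
    double rent_per_sec, double miss_cost) {
    std::unordered_map<PageId, double> table;
    for (const auto& [page, gaps] : history) {
        table[page] = BestTtl(gaps, candidates, rent_per_sec, miss_cost);
    }
    return table;
}

// Serve-time lookup; pages with no history get a caller-supplied default.
double LookupTtl(const std::unordered_map<PageId, double>& table,
                 PageId page, double default_ttl) {
    const auto it = table.find(page);
    return it == table.end() ? default_ttl : it->second;
}
```

Unlike the decision tree, this table keys on page identity rather than on features, so it cannot generalize to pages it has never seen; the default TTL covers that case.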

Production evidence

Deployed in Spanner: the decision-tree approach combined with ski-rental theory reduced memory usage by 15.5% and total cost of ownership (TCO) by 5%, at the price of 5.5% more misses (cheap ones). (Source: sources/2026-06-25-google-optimizing-cloud-economics-with-linear-elastic-caching)

Seen in
