PATTERN Cited by 1 source
ML-predicted TTL¶
Pattern¶
Use a lightweight machine-learned model to predict the optimal time-to-live (TTL) for each cached item based on its observable features, rather than applying a fixed TTL or a purely heuristic rule. The model is trained on historical access traces to minimize total cost (memory rental + miss penalty).
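The training objective can be made concrete with a small sketch. The C++ below replays one item's observed reuse gaps against a candidate TTL and sums memory rental plus miss penalty. The struct, field names, and units are illustrative assumptions, not taken from the source. It also assumes the TTL timer restarts on every access and ignores the rental paid after the final access.

```cpp
#include <cassert>
#include <vector>

// Hypothetical cost parameters; names and units are assumptions for illustration.
struct CostParams {
    double rental_per_byte_sec;  // cost of holding one byte in memory for one second
    double miss_cost;            // cost of one re-fetch after the item has expired
    double size_bytes;           // size of the cached item
};

// Total cost of caching one item with a given TTL, replayed over the observed
// gaps (in seconds) between consecutive accesses to that item.
double TotalCost(const std::vector<double>& reuse_gaps_sec, double ttl_sec,
                 const CostParams& p) {
    double cost = 0.0;
    for (double gap : reuse_gaps_sec) {
        if (gap <= ttl_sec) {
            // Hit: memory was rented for the whole gap.
            cost += gap * p.size_bytes * p.rental_per_byte_sec;
        } else {
            // Miss: memory was rented until expiry, then the item was re-fetched.
            cost += ttl_sec * p.size_bytes * p.rental_per_byte_sec + p.miss_cost;
        }
    }
    return cost;
}
```

Under these assumptions, a longer TTL trades higher rental for fewer misses, so the best TTL for an item depends on its size, its miss cost, and how soon it tends to be reused.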
Constraints¶
- Latency: the prediction runs in the request path. At billions of requests per second, only nanosecond-scale overhead is acceptable.
- Interpretability: operators need to understand what the model is doing, especially when it affects cache hit rates.
- Feature availability: the model can only use features available at cache-insertion time (item size, operation type, miss cost), not future access patterns.
Solution shape¶
A shallow decision tree (few splits, translatable to a handful of if-else branches in C++) that considers:
- Page/item size
- Miss cost (latency or I/O cost to re-fetch)
- Operation type (read, scan, point lookup)
The tree predicts the optimal TTL that minimizes expected cost under the ski-rental cost model.
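To show what "a handful of if-else branches" might look like, here is a minimal sketch of a compiled shallow tree over the three features listed above. Every threshold and leaf TTL is a made-up placeholder, and the type and function names are hypothetical. In a real deployment these values would come from training on access traces, not from this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature set, limited to what is known at cache-insertion time.
enum class OpType { kPointLookup, kRead, kScan };

struct ItemFeatures {
    std::uint32_t size_bytes;  // page/item size
    double miss_cost_us;       // cost to re-fetch, here expressed as latency
    OpType op;                 // operation that inserted the item
};

// Placeholder split points; a trained tree would supply the real ones.
constexpr std::uint32_t kSmallItemBytes = 16 * 1024;
constexpr double kExpensiveMissUs = 1000.0;

// A depth-3 decision tree flattened into branches. Returns a TTL in seconds.
inline double PredictTtlSeconds(const ItemFeatures& f) {
    if (f.op == OpType::kScan) {
        return 1.0;  // assumed: scanned pages are rarely reused soon
    }
    if (f.miss_cost_us >= kExpensiveMissUs) {
        // Expensive to re-fetch: keep longer, especially if small.
        return f.size_bytes <= kSmallItemBytes ? 600.0 : 120.0;
    }
    // Cheap to re-fetch: keep briefly, especially if large.
    return f.size_bytes <= kSmallItemBytes ? 60.0 : 10.0;
}
```

Because the tree is a few comparisons on values already in hand at insertion, it adds no memory lookups or allocations to the request path, and an operator can read the policy directly from the branches.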
Trade-offs¶
- Gives up some accuracy relative to deeper models in exchange for near-zero inference overhead (a handful of branches)
- Requires periodic retraining if workload characteristics shift
- On public traces without application-level features, a simpler per-page historical-best-TTL lookup is used as a fallback
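The per-page fallback in the last bullet could be sketched as follows: for each page, replay its historical reuse gaps against a small set of candidate TTLs and keep the cheapest. The function names, the candidate grid, and the flat per-second rental rate are assumptions made for illustration. The source does not describe how the lookup is built.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <unordered_map>
#include <vector>

// Cost of one candidate TTL replayed over a page's historical reuse gaps.
// rental_per_sec is an assumed cost of keeping the page resident for one second.
double ReplayCost(const std::vector<double>& gaps_sec, double ttl_sec,
                  double rental_per_sec, double miss_cost) {
    double cost = 0.0;
    for (double gap : gaps_sec) {
        cost += (gap <= ttl_sec) ? gap * rental_per_sec
                                 : ttl_sec * rental_per_sec + miss_cost;
    }
    return cost;
}

// Builds a page_id -> cheapest candidate TTL table from historical traces.
// Ties are resolved in favor of the candidate that appears first.
std::unordered_map<std::uint64_t, double> BuildBestTtlTable(
    const std::unordered_map<std::uint64_t, std::vector<double>>& gaps_by_page,
    const std::vector<double>& candidate_ttls_sec,
    double rental_per_sec, double miss_cost) {
    std::unordered_map<std::uint64_t, double> best;
    for (const auto& entry : gaps_by_page) {
        double best_cost = std::numeric_limits<double>::infinity();
        double best_ttl = 0.0;
        for (double ttl : candidate_ttls_sec) {
            const double c = ReplayCost(entry.second, ttl, rental_per_sec, miss_cost);
            if (c < best_cost) {
                best_cost = c;
                best_ttl = ttl;
            }
        }
        best[entry.first] = best_ttl;
    }
    return best;
}
```

Unlike the decision tree, this table needs one entry per page and cannot generalize to pages it has never seen, which is one plausible reason to treat it only as a fallback when richer features are unavailable.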
Production evidence¶
Deployed in Spanner: the decision-tree approach combined with ski-rental theory reduced memory use by 15.5% and TCO by 5%, at the cost of 5.5% more misses (cheap ones). (Source: sources/2026-06-25-google-optimizing-cloud-economics-with-linear-elastic-caching)
Seen in¶
- sources/2026-06-25-google-optimizing-cloud-economics-with-linear-elastic-caching — shallow decision tree predicting per-page TTL in Spanner at billions of RPS