PATTERN Cited by 1 source

Multi-cadence incremental training¶

Pattern¶

Maintain model freshness by running two nested training loops at different cadences: infrequent full retraining passes on a broad data window, and frequent lightweight incremental updates on recent data mixed with sampled history.
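As a rough illustration of the two cadences, the sketch below decides which loop to run on a given day. The function name `plan_run` and the 28-day interval are hypothetical; the source says only that full retraining is "periodic" and does not give an interval.

```python
from datetime import date, timedelta

# Assumed interval for illustration only; the source does not state one.
FULL_RETRAIN_EVERY_DAYS = 28


def plan_run(today: date, last_full_retrain: date) -> str:
    """Return which loop to run today: 'full' or 'incremental'."""
    days_since_full = (today - last_full_retrain).days
    if days_since_full >= FULL_RETRAIN_EVERY_DAYS:
        return "full"          # low-frequency loop: retrain on a wide window
    return "incremental"       # high-frequency loop: continue from last checkpoint


last_full = date(2026, 1, 1)
schedule = [plan_run(last_full + timedelta(days=d), last_full) for d in (1, 27, 28)]
print(schedule)  # ['incremental', 'incremental', 'full']
```

A real scheduler would likely also trigger a full retrain on other signals, such as a measured quality regression, but that is outside what the source describes.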

Mechanism¶

Low-frequency (periodic): Full pretraining plus post-training on a wide historical window. This resets and refreshes the model's base knowledge.
High-frequency (daily): Continue post-training from yesterday's checkpoint on a blend of:
Latest day's data (captures trends, new catalog items)
Sampled subset of historical data (prevents catastrophic forgetting)
New vocabulary tokens (entities, rows) initialized via fallback tokens
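The daily step above can be sketched in plain Python, under stated assumptions. `build_daily_mix` and `extend_embeddings` are hypothetical helper names, uniform sampling of history is an assumption, and copying the fallback token's vector is only one plausible reading of "initialized via fallback tokens"; the source does not specify the sampling scheme or the exact initialization.

```python
import random


def build_daily_mix(latest_day, history, history_sample_size, seed=0):
    """Blend all of the latest day's examples with a uniform sample of history."""
    rng = random.Random(seed)
    k = min(history_sample_size, len(history))   # never ask for more than exists
    replay = rng.sample(history, k)               # sampled history, without replacement
    mix = list(latest_day) + replay
    rng.shuffle(mix)                              # interleave recent and replayed examples
    return mix


def extend_embeddings(embeddings, new_tokens, fallback_token):
    """Give each unseen token a copy of the fallback token's vector (assumed scheme)."""
    extended = dict(embeddings)
    for token in new_tokens:
        if token not in extended:
            extended[token] = list(embeddings[fallback_token])
    return extended


latest = [("user1", "title_new"), ("user2", "title_a")]
history = [("user%d" % i, "title_a") for i in range(100)]
mix = build_daily_mix(latest, history, history_sample_size=6)
print(len(mix))  # 8

emb = {"<fallback>": [0.1, 0.2], "title_a": [0.5, -0.3]}
emb = extend_embeddings(emb, ["title_new"], "<fallback>")
print(emb["title_new"])  # [0.1, 0.2]
```

Starting new tokens from a shared fallback vector gives them a usable representation on day one; subsequent incremental updates can then move each vector away from the fallback as interactions accumulate.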

Trade-offs¶

Dimension	Benefit	Cost
Full retrain	Corrects drift, rebalances embeddings	Expensive (compute, time)
Daily incremental	Fast adaptation to trends	Risk of overfitting to recency
History mixing	Prevents forgetting	Increases daily training data volume
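To make the history-mixing cost concrete, the sketch below computes how many historical examples must be added so that the latest day makes up a chosen share of the daily mix. The function name `history_sample_size`, the `recent_share` parameter, and the example volume of one million are all hypothetical; the source gives no mixing ratio or data volumes.

```python
import math


def history_sample_size(latest_n: int, recent_share: float) -> int:
    """History examples needed so the latest day is `recent_share` of the mix."""
    if not 0.0 < recent_share <= 1.0:
        raise ValueError("recent_share must be in (0, 1]")
    total = math.ceil(latest_n / recent_share)   # total size of the daily mix
    return total - latest_n                       # the part that comes from history


# Illustrative numbers only.
for share in (1.0, 0.5, 0.25):
    extra = history_sample_size(1_000_000, share)
    print(share, extra, 1_000_000 + extra)
# 1.0 0 1000000
# 0.5 1000000 2000000
# 0.25 3000000 4000000
```

Lowering the recent share reduces the recency-overfitting risk in the table but multiplies the daily training volume by `1 / recent_share`.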

Seen in¶

sources/2026-06-29-netflix-genpage-generative-homepage-construction — Netflix GenPage uses this to keep a 200M+ parameter recommender fresh without daily from-scratch retraining