CONCEPT Cited by 1 source
Short-term vs long-term optimization
Definition
Short-term vs long-term optimization is the fundamental tension in decision systems where maximizing immediate metrics (click-through, conversion, engagement) can undermine sustained outcomes (user satisfaction, retention, lifetime value).
Characteristics of the tension:

- Short-term reward models are trained on immediate outcomes (minutes to hours after action). They excel at near-term engagement but miss cumulative effects.
- Long-term harm (fatigue, churn, trust erosion) only surfaces over extended timeframes, so it is invisible to single-action evaluators.
- Without explicit long-horizon mechanisms, greedy optimization dominates and the system exploits users.
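The tension described above can be illustrated with a toy simulation. This is a minimal sketch under invented assumptions, not a model taken from the cited source: the fatigue dynamics, the constants (`BASE_ENGAGEMENT`, `FATIGUE_PER_SEND`, `RECOVERY_PER_IDLE_DAY`), and the two policies are all hypothetical and chosen only to make the effect visible.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical constants for illustration only.
BASE_ENGAGEMENT = 0.5         # expected engagement of a send to a fresh user
FATIGUE_PER_SEND = 0.25       # fatigue added by every send
RECOVERY_PER_IDLE_DAY = 0.25  # fatigue removed by every day without a send


@dataclass
class UserState:
    fatigue: float = 0.0   # 0.0 = fresh, 1.0 = fully fatigued
    churned: bool = False  # once True, the user never engages again


def immediate_reward(state: UserState) -> float:
    """Expected engagement of sending right now (the short-term signal)."""
    return BASE_ENGAGEMENT * (1.0 - state.fatigue)


def step(state: UserState, send: bool) -> float:
    """Advance one day, mutate the state, and return the reward collected."""
    if state.churned:
        return 0.0
    if send:
        reward = immediate_reward(state)
        state.fatigue = min(1.0, state.fatigue + FATIGUE_PER_SEND)
        state.churned = state.fatigue >= 1.0
        return reward
    state.fatigue = max(0.0, state.fatigue - RECOVERY_PER_IDLE_DAY)
    return 0.0


def greedy_policy(state: UserState, day: int) -> bool:
    """Send whenever the immediate reward beats not sending (which yields 0)."""
    return immediate_reward(state) > 0.0


def paced_policy(state: UserState, day: int) -> bool:
    """Send every other day so fatigue can recover between sends."""
    return day % 2 == 0


def simulate(policy: Callable[[UserState, int], bool], days: int = 30) -> float:
    """Total reward collected by a policy over a fixed horizon."""
    state = UserState()
    return sum(step(state, policy(state, day)) for day in range(days))


print("greedy:", simulate(greedy_policy))  # 1.25 (user churns on day 4)
print("paced: ", simulate(paced_policy))   # 7.5  (15 sends at full engagement)
```

In this toy setup every individual greedy decision looks correct to a single-action evaluator, since sending always beats not sending on that day. The harm only appears in the cumulative total.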
Resolution approaches:

- Hierarchical policies: slow strategic layer manages long-term constraints; fast layer optimizes within those bounds (see patterns/slow-fast-hierarchical-policy)
- Explicit cost terms: universal per-action costs that prevent degenerate high-frequency policies
- Multi-horizon reward functions: combining immediate and delayed signals with appropriate discounting
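The three approaches above can be sketched together in a few lines. This is a hedged illustration rather than any production design: the function names (`multi_horizon_reward`, `slow_layer_budget`, `fast_layer_select`), the weekly cap, the discount factor, and the cost values are all assumptions introduced here.

```python
from typing import List, Mapping, Sequence


def multi_horizon_reward(
    immediate: float,
    delayed: Sequence[float],
    discount: float = 0.9,
    action_cost: float = 0.0,
) -> float:
    """Immediate signal, plus discounted delayed signals, minus a flat cost.

    delayed[k] is the signal observed k + 1 periods after the action and is
    weighted by discount ** (k + 1).
    """
    discounted = sum(discount ** (k + 1) * r for k, r in enumerate(delayed))
    return immediate + discounted - action_cost


def slow_layer_budget(recent_sends: int, weekly_cap: int = 3) -> int:
    """Slow (strategic) layer: how many more actions are allowed this week."""
    return max(0, weekly_cap - recent_sends)


def fast_layer_select(
    scores: Mapping[str, float], budget: int, action_cost: float
) -> List[str]:
    """Fast layer: pick the best candidates, but only within the budget.

    Each candidate pays the same per-action cost, so marginal candidates
    are dropped even when budget remains.
    """
    net = {name: score - action_cost for name, score in scores.items()}
    worthwhile = [name for name, value in net.items() if value > 0.0]
    ranked = sorted(worthwhile, key=lambda name: net[name], reverse=True)
    return ranked[:budget]


# Usage: an action with a strong immediate signal but negative delayed signals.
print(multi_horizon_reward(1.0, [-0.5, -0.5], discount=0.5, action_cost=0.25))

# Usage: two sends already made this week, so one slot remains.
budget = slow_layer_budget(recent_sends=2, weekly_cap=3)
print(fast_layer_select({"a": 0.9, "b": 0.4, "c": 0.05}, budget, action_cost=0.1))
```

The design point is the separation of concerns: the slow layer never scores individual candidates, and the fast layer never decides how often to act. The per-action cost guards against the case where a generous budget would otherwise be filled with low-value actions.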
Seen in
- sources/2026-06-19-netflix-thinking-fast-slow-for-a-personalized-notification-system — Netflix's previous single-model system optimized short-horizon metrics and missed cumulative fatigue. The hierarchical Slow/Fast architecture isolates pacing into a strategic layer that manages long-term member health.