
CONCEPT Cited by 1 source

Short-term vs long-term optimization

Definition

Short-term vs long-term optimization is the fundamental tension in decision systems where maximizing immediate metrics (click-through, conversion, engagement) can undermine sustained outcomes (user satisfaction, retention, lifetime value).

Characteristics of the tension:

- Short-term reward models are trained on immediate outcomes (minutes to hours after an action). They excel at near-term engagement but miss cumulative effects.
- Long-term harm (fatigue, churn, trust erosion) surfaces only over extended timeframes, so it is invisible to single-action evaluators.
- Without explicit long-horizon mechanisms, greedy optimization dominates and the system exploits users.
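The tension can be made concrete with a toy simulation. The sketch below is illustrative only: the `simulate` function, the fatigue counter, the `churn_at` threshold, and both policies are hypothetical and do not come from the cited source. Each send earns one unit of immediate reward, so a greedy policy always sends, but accumulated fatigue eventually causes churn and ends all future reward.

```python
from typing import Callable


def simulate(policy: Callable[[int], bool], steps: int = 20, churn_at: int = 5) -> int:
    """Return total reward collected over `steps` under a toy fatigue model.

    Assumptions (hypothetical, for illustration only):
    - every action (e.g. a notification) yields an immediate reward of 1;
    - every action adds 1 to fatigue, every idle step removes 1 (floor 0);
    - once fatigue reaches `churn_at`, the user churns and the episode ends.
    """
    fatigue = 0
    total = 0
    for t in range(steps):
        if policy(t):
            total += 1
            fatigue += 1
            if fatigue >= churn_at:
                return total  # churn: no further reward is possible
        else:
            fatigue = max(0, fatigue - 1)
    return total


def greedy(t: int) -> bool:
    """Always act: each step looks optimal to a single-action evaluator."""
    return True


def paced(t: int) -> bool:
    """Act on every other step, letting fatigue recover in between."""
    return t % 2 == 0


if __name__ == "__main__":
    print("greedy:", simulate(greedy))
    print("paced:", simulate(paced))
```

Under these assumed parameters the greedy policy wins every individual step yet collects less in total, because the churn event is never visible to an evaluator that scores one action at a time.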

Resolution approaches:

- Hierarchical policies: a slow strategic layer manages long-term constraints; a fast layer optimizes within those bounds (see patterns/slow-fast-hierarchical-policy).
- Explicit cost terms: universal per-action costs that prevent degenerate high-frequency policies.
- Multi-horizon reward functions: combining immediate and delayed signals with appropriate discounting.
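As a minimal sketch of the last two approaches, the function below combines an immediate signal with discounted delayed signals and subtracts a flat per-action cost. The names `RewardConfig` and `multi_horizon_return`, and the default values of `gamma` and `action_cost`, are assumptions made for illustration; the source does not specify a formula or any parameter values.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RewardConfig:
    gamma: float = 0.9        # per-step discount factor (assumed value)
    action_cost: float = 0.1  # flat cost charged for every action (assumed value)


def multi_horizon_return(
    immediate: float,
    delayed: Sequence[float],
    acted: bool,
    cfg: RewardConfig = RewardConfig(),
) -> float:
    """Immediate reward plus discounted delayed signals, minus a per-action cost.

    `delayed[k]` is the signal observed k + 1 steps after the action
    (for example a retention or satisfaction measurement).
    """
    discounted = sum(cfg.gamma ** (k + 1) * r for k, r in enumerate(delayed))
    cost = cfg.action_cost if acted else 0.0
    return immediate + discounted - cost


if __name__ == "__main__":
    cfg = RewardConfig(gamma=0.5, action_cost=0.25)
    # An action with a good immediate outcome but negative delayed effects.
    print(multi_horizon_return(1.0, [-1.0, -1.0], acted=True, cfg=cfg))
```

With a score like this, an action that looks attractive on the immediate signal alone can come out neutral or negative once delayed harm and the per-action cost are included, which is what discourages high-frequency policies. How to set the discount and the cost is a design choice the source leaves open.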

Seen in
