
PATTERN Cited by 1 source

Slow-fast hierarchical policy

Pattern

A slow-fast hierarchical policy structures a decision system into two asynchronous layers:

  1. Slow Policy (Planner): Runs at a low cadence (e.g., weekly). Makes strategic decisions about constraints and targets using long-horizon signals. Writes decisions to a shared store.
  2. Fast Policy (Executor): Runs on every action opportunity (real-time). Makes tactical decisions (e.g., which item to select) within the bounds set by the Slow Policy. Reads the plan from the shared store as a feature.
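
The two layers can be sketched as follows. This is a minimal illustration, not the source's implementation: the Plan schema, the weekly_cap field, the engagement_28d signal, the thresholds, and the dict standing in for the shared store are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Plan:
    """Strategic decision written by the slow policy (hypothetical schema)."""
    weekly_cap: int  # max actions allowed for a member in the planning window


def slow_policy(engagement_28d: float) -> Plan:
    """Planner: runs at a low cadence and looks at a long-horizon signal.

    The thresholds below are illustrative assumptions.
    """
    if engagement_28d < 0.2:
        return Plan(weekly_cap=1)  # low engagement: guard against fatigue
    if engagement_28d < 0.6:
        return Plan(weekly_cap=3)
    return Plan(weekly_cap=7)


def fast_policy(plan: Plan, sent_this_week: int,
                scores: Dict[str, float]) -> Optional[str]:
    """Executor: runs on every action opportunity, inside the plan's bounds."""
    if sent_this_week >= plan.weekly_cap or not scores:
        return None  # budget exhausted, or nothing to choose from
    return max(scores, key=scores.get)  # tactical choice: best-scoring item


# A plain dict stands in for the shared store.
shared_store: Dict[str, Plan] = {}
shared_store["member-1"] = slow_policy(engagement_28d=0.35)  # weekly job

# Request time: the executor only reads the stored plan.
plan = shared_store["member-1"]
choice = fast_policy(plan, sent_this_week=2, scores={"a": 0.1, "b": 0.9})
blocked = fast_policy(plan, sent_this_week=3, scores={"a": 0.1, "b": 0.9})
```

Here choice is "b" because the member is still under the cap of 3, while blocked is None once the cap is reached. The executor never calls the planner; it only consumes the stored plan.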

Why it works

  • Decouples time horizons: The slow layer explicitly manages long-term effects (fatigue, churn) that per-action fast optimization ignores.
  • Stickiness: Members receive a consistent experience over the planning window, because the plan changes only when the Slow Policy reruns.
  • Independent evolution: Each layer can be retrained, A/B tested, or replaced without touching the other.
  • Clean separation of concerns: "how often" vs. "which one" become independent optimization problems.

Communication mechanism

A low-latency feature store bridges the two policies asynchronously: there is no synchronous coupling and no shared inference path. The Slow Policy writes its strategic intent to the store; the Fast Policy reads it as an input feature at decision time.
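
A rough sketch of that bridge, assuming an in-memory stand-in for the feature store. The PlanStore class, its write/read methods, the TTL, and the default-cap fallback for missing or stale plans are assumptions added for illustration; the source does not say how staleness is handled.

```python
import time
from typing import Dict, Optional, Tuple


class PlanStore:
    """In-memory stand-in for a low-latency feature store (hypothetical)."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        # member_id -> (weekly_cap, written_at)
        self._rows: Dict[str, Tuple[int, float]] = {}

    def write(self, member_id: str, weekly_cap: int,
              now: Optional[float] = None) -> None:
        """Called by the slow policy on its own schedule."""
        written_at = time.time() if now is None else now
        self._rows[member_id] = (weekly_cap, written_at)

    def read(self, member_id: str, default_cap: int,
             now: Optional[float] = None) -> int:
        """Called by the fast policy at request time; never invokes the planner."""
        now = time.time() if now is None else now
        row = self._rows.get(member_id)
        if row is None or now - row[1] > self._ttl:
            return default_cap  # assumed fallback for a missing or stale plan
        return row[0]


WEEK = 7 * 24 * 3600
store = PlanStore(ttl_seconds=WEEK)
store.write("member-1", weekly_cap=3, now=0.0)  # slow policy, weekly run

fresh = store.read("member-1", default_cap=1, now=3600.0)
stale = store.read("member-1", default_cap=1, now=WEEK + 1.0)
missing = store.read("member-2", default_cap=1, now=3600.0)
```

With these inputs, fresh is 3, while stale and missing both fall back to 1. Because the read path depends only on the store, a late or failed planner run degrades the executor to the default rather than blocking it.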

Seen in
