Extended (R, s, Q) policy

Classical (R, s, Q) background

The (R, s, Q) policy is a classical inventory-theory replenishment rule:

  • R — review interval (check inventory every R time units).
  • s — reorder point (trigger replenishment when on-hand inventory drops below s).
  • Q — replenishment quantity (each order is exactly Q units).

Classical (R, s, Q) assumes a steady-state product — demand rate is roughly stable, there's no launch or decay phase, and one fixed (s, Q) pair governs the product across its full life.
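
As a minimal sketch of the rule described above, the three parameters can be wired into a toy simulation. The function names, the zero lead time, and the lost-sales handling are illustrative assumptions, not details from the source.

```python
def classical_rsq_order(t: int, on_hand: int, R: int, s: int, Q: int) -> int:
    """Order quantity at time step t under a classical (R, s, Q) rule."""
    is_review = t % R == 0           # inventory is only inspected every R steps
    if is_review and on_hand < s:    # reorder point breached at a review
        return Q                     # every order is exactly Q units
    return 0


def simulate_classical(demand, R, s, Q, on_hand=0):
    """Toy simulation (assumed: zero lead time, unmet demand is lost)."""
    orders = []
    for t, d in enumerate(demand):
        q = classical_rsq_order(t, on_hand, R, s, Q)
        on_hand += q                       # order arrives immediately (assumption)
        on_hand = max(on_hand - d, 0)      # serve demand, never go negative
        orders.append(q)
    return orders


orders = simulate_classical([3] * 6, R=2, s=5, Q=10, on_hand=10)
print(orders)
```

With a steady demand of 3 units per step, the only order is placed at the first review where stock has fallen below s; between reviews the rule is blind to inventory, which is the role of R.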

Why classical (R, s, Q) fails in fashion commerce

Fashion articles have strong lifecycle structure:

  • Launch phase — demand is expected but the article has zero history; you need to put inventory on shelf before the first sale signal. Classical (R, s, Q) waits for inventory to drop below s before ordering — which never happens if the article hasn't shipped yet.
  • Decay phase — once an article is being replaced by the next season, continued replenishment causes overstock write-downs. Classical (R, s, Q) keeps ordering as long as on_hand < s regardless of remaining commercial life.

Between these two phases, classical (R, s, Q) would behave sensibly, but both endpoints are wrong by construction.

The extension

Zalando's extension adds two lifecycle-aware parameters on top of classical (R, s, Q):

  • Q₀: kick-start quantity injected at time t₀ (launch time). Zero-history inventory push that front-loads the article's initial shelf presence.
  • t_limit: lifecycle cutoff time after which replenishment is suppressed (or zeroed) regardless of inventory position. Protects against decay-phase overstock.
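
One way to read the two additions is as guards wrapped around the classical decision. The sketch below is an interpretation, not Zalando's implementation: the class and function names are hypothetical, reviews are assumed to fall on t % R == 0, and the cutoff is assumed to apply from t_limit onwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtendedPolicy:
    R: int        # review interval
    s: int        # reorder point
    Q: int        # regular replenishment quantity
    t0: int       # launch time
    Q0: int       # kick-start quantity injected at t0
    t_limit: int  # lifecycle cutoff


def extended_order(p: ExtendedPolicy, t: int, on_hand: int) -> int:
    """Order quantity at time t under the extended policy (illustrative)."""
    if t == p.t0:
        return p.Q0                     # launch: push stock with zero history
    if t < p.t0 or t >= p.t_limit:
        return 0                        # before launch or past the cutoff
    if t % p.R == 0 and on_hand < p.s:
        return p.Q                      # steady-state middle: classical rule
    return 0


policy = ExtendedPolicy(R=1, s=5, Q=10, t0=2, Q0=30, t_limit=6)
print([extended_order(policy, t, on_hand=0) for t in range(8)])
```

Even with empty shelves at every step, nothing is ordered before launch or after the cutoff; only the window in between follows the classical trigger.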

The full policy parameter vector is θ = (t₀, Q₀, s, Q), with t_limit as a decay-phase guardrail. The optimiser searches θ-space per article × merchant over the 12-week discrete-event simulation (DES) horizon to minimise the 75th-percentile cost.
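
The search objective can be sketched as follows. Everything specific here is an assumption made for illustration: the cost function (holding plus lost-sale penalty), the uniform demand model, weekly time steps with R = 1, and the brute-force grid search standing in for the black-box optimiser. Only the idea of scoring θ by the 75th percentile of simulated cost comes from the text above.

```python
import random
import statistics


def path_cost(theta, demand, t_limit, holding=1.0, lost_sale=4.0):
    """Cost of one demand path under theta = (t0, Q0, s, Q); review every step."""
    t0, Q0, s, Q = theta
    on_hand, cost = 0, 0.0
    for t, d in enumerate(demand):
        if t == t0:
            on_hand += Q0                        # kick-start at launch
        elif t0 < t < t_limit and on_hand < s:
            on_hand += Q                         # regular replenishment
        sold = min(on_hand, d)
        on_hand -= sold
        cost += holding * on_hand + lost_sale * (d - sold)
    return cost


def p75_cost(theta, paths, t_limit):
    """75th-percentile cost across Monte Carlo demand paths."""
    costs = [path_cost(theta, p, t_limit) for p in paths]
    return statistics.quantiles(costs, n=4)[2]   # third quartile cut point


def best_theta(candidates, paths, t_limit):
    """Brute-force stand-in for the optimiser."""
    return min(candidates, key=lambda th: p75_cost(th, paths, t_limit))


rng = random.Random(0)
paths = [[rng.randint(0, 6) for _ in range(12)] for _ in range(200)]  # 12 steps
candidates = [(0, q0, s, q) for q0 in (0, 10, 20) for s in (2, 5) for q in (5, 10)]
theta_star = best_theta(candidates, paths, t_limit=10)
print(theta_star, p75_cost(theta_star, paths, t_limit=10))
```

Scoring by an upper quantile rather than the mean penalises parameter choices that look good on average but perform badly on a meaningful share of demand realisations.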

Verbatim from the paper announcement

"Classic reorder-point policies are often too rigid for the fast-paced world of fashion. We extended the classical (R, s, Q) policy by introducing an initial kick-start quantity (Q₀) and a time-based lifecycle cutoff (t_limit). This allows the policy to be aggressive during a product's launch and conservative as it reaches its decay phase."

Empirical win vs classical families

From the paper's baseline comparison (same data + DES + optimiser), GMV uplift vs human baseline:

Policy                          GMV uplift
Extended (R, s, Q) (Zalando)    +22.11%
Tuned (s, S)                    +13.39%
Periodic base-stock             +12.50%
Myopic Newsvendor               +5.07%

The delta over Tuned (s, S) — the closest classical family — is +8.72pp in GMV uplift. Zalando's framing: "even the Tuned (s, S) policy, which is a common industry standard, falls short because its static thresholds cannot match the responsiveness of our extended (R, s, Q) variables (Q₀ and t_limit) in a high-variance environment."
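
The percentage-point gaps follow directly from the reported uplifts. The short check below recomputes them from the four figures in the comparison; the variable names are illustrative.

```python
# GMV uplift vs human baseline, in percent, as reported in the comparison.
uplift_pct = {
    "Extended (R, s, Q)": 22.11,
    "Tuned (s, S)": 13.39,
    "Periodic base-stock": 12.50,
    "Myopic Newsvendor": 5.07,
}

best = uplift_pct["Extended (R, s, Q)"]

# Gap of the extended policy over each baseline, in percentage points.
gap_pp = {
    name: round(best - value, 2)
    for name, value in uplift_pct.items()
    if name != "Extended (R, s, Q)"
}
print(gap_pp)
```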

Why the extension helps

  • Launch phase: Q₀ creates the initial shelf presence that lets the article start selling (and generating real demand data for the forecaster to learn from). Classical (R, s, Q) waits for a reorder trigger that never fires because there's no historical demand signal.
  • Decay phase: t_limit prevents the policy from dutifully replenishing an article that is commercially dying. Saves holding cost + overstock write-downs.
  • Steady-state middle: behaves like classical (R, s, Q). The extensions are surgical add-ons, not a replacement.

Tradeoffs

  • Parameter space grows. Classical (R, s, Q) has 3 parameters; the extended version has 5 (t₀, Q₀, s, Q, t_limit). The black-box optimiser has to search a larger space per article × merchant.
  • Lifecycle-boundary sensitivity. Choosing t_limit too early caps the upside on still-growing articles; too late defeats the point. Zalando uses Monte Carlo to hedge across realisations, but the choice of t_limit distribution prior is load-bearing.
  • Fashion-specific framing. The extension is motivated by fashion commerce's launch/decay structure. For stable-catalogue commerce (groceries, industrial supplies), the kick-start + cutoff extensions may be unused or parameterised trivially (t₀ = beginning of time, t_limit = end of time).
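
The trivial parameterisation in the last point can be checked in a small sketch: with the launch placed before the simulated horizon, a zero kick-start, and an unbounded cutoff, an extended rule collapses to the classical one. The function names, the t % R review convention, and the choice of t0 = -1 to stand in for "beginning of time" are assumptions made for illustration.

```python
import math


def classical_order(t, on_hand, R, s, Q):
    """Classical (R, s, Q): order Q at a review if stock is below s."""
    return Q if t % R == 0 and on_hand < s else 0


def extended_order(t, on_hand, R, s, Q, t0, Q0, t_limit):
    """Extended rule: kick-start at t0, nothing outside [t0, t_limit)."""
    if t == t0:
        return Q0
    if t < t0 or t >= t_limit:
        return 0
    return classical_order(t, on_hand, R, s, Q)


# "Beginning of time" launch, no kick-start, and no cutoff.
TRIVIAL = dict(t0=-1, Q0=0, t_limit=math.inf)

same = all(
    extended_order(t, oh, 2, 5, 10, **TRIVIAL) == classical_order(t, oh, 2, 5, 10)
    for t in range(24)
    for oh in range(12)
)
print(same)
```

Under these settings the lifecycle guards never fire, so the extra parameters cost nothing in behaviour for a stable catalogue, only in search space.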

Canonical instance (Zalando ZEOS)
