CONCEPT Cited by 1 source

Extended (R, s, Q) policy¶

Classical (R, s, Q) background¶

The (R, s, Q) policy is a classical inventory-theory replenishment rule:

R — review interval (check inventory every R time units).
s — reorder point (trigger replenishment when on-hand inventory drops below s).
Q — replenishment quantity (each order is exactly Q units).

Classical (R, s, Q) assumes a steady-state product — demand rate is roughly stable, there's no launch or decay phase, and one fixed (s, Q) pair governs the product across its full life.
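
As a minimal sketch of the rule described above, the decision at each time step can be written as a single function. The function name, integer time steps, and reviews at multiples of R are assumptions made for illustration; the strict "below s" comparison follows the wording of the definition. Some formulations compare inventory position (on-hand plus on-order) rather than on-hand stock, which this sketch does not model.

```python
def classical_rsq_order(t: int, inventory: float, R: int, s: float, Q: float) -> float:
    """Return the order quantity at integer time t under a classical (R, s, Q) rule."""
    is_review_time = t % R == 0           # inventory is only inspected every R time units
    if is_review_time and inventory < s:  # strict "below s", following the definition above
        return Q                          # every order is exactly Q units
    return 0.0


# Weekly review (R=7), reorder point 10, lot size 40, inventory held at 5 for illustration.
orders = [classical_rsq_order(t, inventory=5, R=7, s=10, Q=40) for t in range(15)]
```

With these illustrative numbers an order is placed only at the review times t = 0, 7, and 14; between reviews the rule orders nothing even though inventory is below s.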

Why classical (R, s, Q) fails in fashion commerce¶

Fashion articles have strong lifecycle structure:

Launch phase — demand is expected but the article has zero history; you need to put inventory on shelf before the first sale signal. Classical (R, s, Q) waits for inventory to drop below s before ordering — which never happens if the article hasn't shipped yet.
Decay phase — once an article is being replaced by the next season, continued replenishment causes overstock write-downs. Classical (R, s, Q) keeps ordering as long as on_hand < s regardless of remaining commercial life.

Between these two phases, classical (R, s, Q) behaves sensibly, but at both ends of the lifecycle it is wrong by construction.

The extension¶

Zalando's extension adds two lifecycle-aware parameters on top of classical (R, s, Q):

Q₀ — kick-start quantity injected at time t₀ (launch time). Zero-history inventory push that front-loads the article's initial shelf presence.
t_limit — lifecycle cutoff time after which replenishment is suppressed (or zeroed) regardless of inventory position. Protects against decay-phase overstock.
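
One way to read the two additions is as guards wrapped around the classical rule. The sketch below is an illustration under stated assumptions, not Zalando's implementation: the class name `ExtendedRSQ`, integer time steps, reviews counted from t₀, no ordering before t₀, a cutoff that applies strictly after t_limit, and a kick-start that is added to (rather than replacing) the regular check at t₀ are all choices made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtendedRSQ:
    R: int          # review interval
    s: float        # reorder point
    Q: float        # regular replenishment quantity
    t0: int         # launch time
    Q0: float       # kick-start quantity injected at t0
    t_limit: int    # lifecycle cutoff; no orders after this time

    def order(self, t: int, inventory: float) -> float:
        """Order quantity at time t given current inventory."""
        if t < self.t0 or t > self.t_limit:
            return 0.0                               # before launch or past the cutoff
        kick = self.Q0 if t == self.t0 else 0.0      # one-off launch injection
        is_review_time = (t - self.t0) % self.R == 0
        # Regular (s, Q) check, counting the kick-start as already on hand.
        regular = self.Q if is_review_time and inventory + kick < self.s else 0.0
        return kick + regular


policy = ExtendedRSQ(R=7, s=10, Q=40, t0=3, Q0=100, t_limit=50)
launch_order = policy.order(3, inventory=0)      # kick-start at t0
late_order = policy.order(52, inventory=5)       # review time, but past t_limit
```

Here `launch_order` is the kick-start quantity and `late_order` is zero, even though inventory is below s at a review time, because the cutoff takes precedence.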

The full policy parameter vector is θ = (t₀, Q₀, s, Q), with t_limit as a decay-phase guardrail. The optimiser searches θ-space per article × merchant over the 12-week DES horizon to minimise the 75th-percentile cost.
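
The search loop can be sketched as follows. The source says only that the optimiser is gradient-free and that the objective is the 75th-percentile cost over a 12-week simulated horizon; everything else here is an assumption for illustration: plain random search stands in for the unspecified optimiser, and the demand generator, cost terms (`price`, `unit_cost`, `holding`), daily time steps, zero lead time, and parameter ranges are hypothetical.

```python
import random
from statistics import quantiles

HORIZON = 12 * 7   # 12-week horizon in daily steps (daily granularity is an assumption)
R = 7              # review interval, held fixed in this sketch


def simulate_cost(theta, demand_path, price=30.0, unit_cost=12.0, holding=0.05):
    """Toy simulation of one article on one demand path; returns cost (negative = profit)."""
    t0, Q0, s, Q, t_limit = theta
    inventory = cost = 0.0
    for t, demand in enumerate(demand_path):
        order = Q0 if t == t0 else 0.0
        if t0 <= t <= t_limit and (t - t0) % R == 0 and inventory + order < s:
            order += Q
        inventory += order                 # zero lead time (assumed)
        sold = min(inventory, demand)
        inventory -= sold
        cost += unit_cost * order - price * sold + holding * inventory
    return cost


def sample_demand_path(rng):
    """Hypothetical lifecycle demand: ramp-up, peak, decay, plus Gaussian noise."""
    scale = rng.uniform(0.5, 1.5)
    return [max(0.0, scale * min(t, HORIZON - t) / 4 + rng.gauss(0.0, 1.0))
            for t in range(HORIZON)]


def p75_cost(theta, paths):
    """75th percentile of cost across Monte Carlo demand paths."""
    costs = [simulate_cost(theta, path) for path in paths]
    return quantiles(costs, n=4)[2]    # third quartile cut point


def random_search(n_candidates=200, n_paths=50, seed=0):
    """Gradient-free stand-in optimiser: sample theta at random, keep the best P75 cost."""
    rng = random.Random(seed)
    paths = [sample_demand_path(rng) for _ in range(n_paths)]
    best_theta, best_cost = None, float("inf")
    for _ in range(n_candidates):
        theta = (
            rng.randrange(0, 14),         # t0
            rng.uniform(0.0, 300.0),      # Q0
            rng.uniform(0.0, 100.0),      # s
            rng.uniform(1.0, 200.0),      # Q
            rng.randrange(14, HORIZON),   # t_limit
        )
        cost = p75_cost(theta, paths)
        if cost < best_cost:
            best_theta, best_cost = theta, cost
    return best_theta, best_cost, paths


best_theta, best_cost, _ = random_search(n_candidates=50, n_paths=20)
```

Scoring every candidate on the same set of sampled paths keeps the comparison between candidates fair. Minimising the 75th percentile rather than the mean penalises parameter choices that look good on average but perform badly on the worse quarter of demand realisations.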

Verbatim from the paper announcement¶

"Classic reorder-point policies are often too rigid for the fast-paced world of fashion. We extended the classical (R, s, Q) policy by introducing an initial kick-start quantity (Q₀) and a time-based lifecycle cutoff (t_limit). This allows the policy to be aggressive during a product's launch and conservative as it reaches its decay phase."

Empirical win vs classical families¶

From the paper's baseline comparison (same data, DES, and optimiser for every policy), GMV uplift versus the human baseline:

Policy	GMV uplift
Extended (R, s, Q) (Zalando)	+22.11%
Tuned (s, S)	+13.39%
Periodic base-stock	+12.50%
Myopic Newsvendor	+5.07%

The gap over Tuned (s, S), the closest classical family, is 8.72 percentage points of GMV uplift (22.11% − 13.39%). Zalando's framing: "even the Tuned (s, S) policy, which is a common industry standard, falls short because its static thresholds cannot match the responsiveness of our extended (R, s, Q) variables (Q₀ and t_limit) in a high-variance environment."
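
The percentage-point gaps implied by the table can be recomputed directly from the reported uplifts. The snippet below uses only the four figures from the table; the variable names are illustrative.

```python
# GMV uplift vs human baseline, in percent, as reported in the table above.
gmv_uplift_pct = {
    "Extended (R, s, Q)": 22.11,
    "Tuned (s, S)": 13.39,
    "Periodic base-stock": 12.50,
    "Myopic Newsvendor": 5.07,
}

extended = gmv_uplift_pct["Extended (R, s, Q)"]

# Gap in percentage points between the extended policy and each baseline.
gap_pp = {
    name: round(extended - uplift, 2)
    for name, uplift in gmv_uplift_pct.items()
    if name != "Extended (R, s, Q)"
}
```

These are differences between uplift percentages (percentage points), not relative improvements of one policy over another.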

Why the extension helps¶

Launch phase — Q₀ creates the initial shelf presence that lets the article start selling (and generating real demand data for the forecaster to learn from). Classical (R, s, Q) waits for a reorder trigger that never fires because there's no historical demand signal.
Decay phase — t_limit prevents the policy from dutifully replenishing an article that is commercially dying. Saves holding cost + overstock write-downs.
Steady-state middle — behaves like classical (R, s, Q). The extensions are surgical add-ons, not a replacement.

Tradeoffs¶

Parameter space grows. Classical (R, s, Q) has 3 parameters; the extended version has 5 (t₀, Q₀, s, Q, t_limit). The black-box optimiser has to search a larger space per article × merchant.
Lifecycle-boundary sensitivity. Setting t_limit too early caps the upside on still-growing articles; setting it too late defeats its purpose. Zalando uses Monte Carlo to hedge across realisations, but the outcome still depends heavily on the prior distribution chosen for t_limit.
Fashion-specific framing. The extension is motivated by fashion commerce's launch/decay structure. For stable-catalogue commerce (groceries, industrial supplies), the kick-start + cutoff extensions may be unused or parameterised trivially (t₀ = beginning of time, t_limit = end of time).
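
The lifecycle-boundary sensitivity in the list above can be illustrated with a deterministic toy run. This is a sketch under assumed numbers, not the Monte Carlo evaluation the source describes: the demand path (10 units per day for 20 days, then nothing), the policy values, daily steps, and zero lead time are all hypothetical. Passing `math.inf` as t_limit corresponds to the trivial "end of time" parameterisation.

```python
import math


def run_lifecycle(t_limit, demand, R=5, s=20, Q=50, t0=0, Q0=50):
    """Deterministic toy run; returns (lost_sales, leftover_inventory)."""
    inventory = lost = 0
    for t, d in enumerate(demand):
        kick = Q0 if t == t0 else 0
        order = kick
        # Regular review, suppressed once t passes the lifecycle cutoff.
        if t0 <= t <= t_limit and (t - t0) % R == 0 and inventory + kick < s:
            order += Q
        inventory += order          # zero lead time (assumed)
        sold = min(inventory, d)
        lost += d - sold
        inventory -= sold
    return lost, inventory


# 10 units/day for 20 days, then the article is commercially dead.
demand = [10] * 20 + [0] * 10

results = {t_limit: run_lifecycle(t_limit, demand) for t_limit in (5, 15, math.inf)}
```

In this toy setting a cutoff at day 5 loses 100 units of sales with nothing left over, a cutoff at day 15 loses nothing and leaves nothing, and no cutoff leaves 50 unsold units because the day-20 review still reorders after demand has stopped.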

Canonical instance (Zalando ZEOS)¶

systems/zeos-replenishment-recommender — runs Extended (R, s, Q) over DES via a gradient-free optimiser with a P75 cost objective.
Backtested over ~2M articles × ~800 merchants for a full year (Oct 2023 – Sep 2024); see concepts/computational-backtest.

Seen in¶

sources/2026-01-14-zalando-paper-announcement-replenishment-optimization-extended-rsq — canonical first disclosure. Nature Scientific Reports paper announcement introducing the extension and quantifying its uplift over classical (s, S), periodic base-stock, and Myopic Newsvendor baselines.

Related¶

concepts/discrete-event-simulation — the simulator that evaluates candidate θ vectors.
concepts/percentile-objective-optimisation — the risk-aware cost aggregation used to pick θ*.
concepts/probabilistic-demand-forecast — the upstream uncertainty distribution the DES samples from.
concepts/monte-carlo-simulation-under-uncertainty — the evaluation mechanism inside the DES.
systems/zeos-replenishment-recommender
companies/zalando