ZEOS Replenishment Recommender

ZEOS Replenishment Recommender is the inventory optimisation pipeline inside the ZEOS Inventory Optimisation System — the downstream consumer of the ZEOS Demand Forecaster's 12-week probabilistic forecasts. It produces replenishment recommendations at per-SKU granularity and delivers them via two paths whose outputs are kept in agreement:

  1. Offline / batch — daily recommendation reports for all partners × articles, precomputed.
  2. Online / interactive — partner-portal-driven re-scoring of a subset of SKUs in response to ad-hoc inventory-setting changes.

Both paths use the same optimisation algorithm on the same feature vectors so online "what-if" answers and daily batch recommendations are guaranteed to agree. See patterns/online-plus-offline-feature-store-parity.

Optimisation algorithm

$$\min_{\theta}\ Costs(\theta) = C_{storage}(\theta) + C_{lost\ sales}(\theta) + C_{overstock}(\theta) + C_{operations}(\theta) + C_{inbound}(\theta)$$

(See systems/zeos-inventory-optimisation-system for the full objective expansion.)

The specific gradient-free optimiser family is not disclosed in the canonical source.
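The objective can still be made concrete with a toy per-SKU cost model. Everything below is an assumption for illustration: the canonical source discloses neither the cost weights nor the optimiser family, so plain random search stands in for the undisclosed gradient-free optimiser.

```python
import random

# Hypothetical per-SKU cost model mirroring the objective above.
# The weights are illustrative assumptions, not disclosed values.
def total_cost(theta, demand=100.0):
    """theta: candidate replenishment quantity for one SKU."""
    storage    = 0.02 * theta                      # holding cost grows with stock
    lost_sales = 1.50 * max(demand - theta, 0.0)   # unmet demand is expensive
    overstock  = 0.40 * max(theta - demand, 0.0)   # excess beyond demand
    operations = 5.0 if theta > 0 else 0.0         # fixed handling cost per order
    inbound    = 0.10 * theta                      # transport cost per unit
    return storage + lost_sales + overstock + operations + inbound

def random_search(cost_fn, low=0.0, high=300.0, iters=2000, seed=7):
    """Gradient-free minimisation by uniform random sampling."""
    rng = random.Random(seed)
    best_theta, best_cost = None, float("inf")
    for _ in range(iters):
        theta = rng.uniform(low, high)
        cost = cost_fn(theta)
        if cost < best_cost:
            best_theta, best_cost = theta, cost
    return best_theta, best_cost

theta_star, cost_star = random_search(total_cost)
```

With these toy weights the optimum sits at theta = demand: below it, lost sales dominate; above it, overstock and storage costs climb.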

Pipeline shape

Feature generation

Same two-tier pattern as the forecaster — see concepts/data-preprocessing-vs-data-transformation-split.

Output: per-SKU feature vector containing:

  • Historical outbound data
  • Inventory states (current stock levels, stock in transit)
  • Inbound volumes
  • Pricing information
  • Article metadata
  • Cost factors
  • Return lead-time weights
  • 12-week probabilistic demand forecasts from the forecaster
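The feature groups above can be sketched as a schema. The post lists the groups but not field names or types, so everything in this dataclass is an illustrative assumption:

```python
from dataclasses import dataclass, field

# Illustrative shape of the per-SKU feature vector; field names and types
# are assumptions -- the source lists feature groups, not a schema.
@dataclass
class SkuFeatureVector:
    sku_id: str
    historical_outbound: list[float]         # past outbound units per week
    current_stock: int                       # inventory state: on-hand units
    stock_in_transit: int                    # inventory state: units in flight
    inbound_volumes: list[float]
    price: float
    article_metadata: dict[str, str]
    cost_factors: dict[str, float]           # storage, operations, inbound, ...
    return_lead_time_weights: list[float]
    demand_forecast: list[float] = field(default_factory=list)  # 12 weekly values

vec = SkuFeatureVector(
    sku_id="SKU-123",
    historical_outbound=[12.0, 9.0, 15.0],
    current_stock=40,
    stock_in_transit=20,
    inbound_volumes=[10.0],
    price=29.9,
    article_metadata={"category": "shoes"},
    cost_factors={"storage": 0.02, "inbound": 0.10},
    return_lead_time_weights=[0.6, 0.3, 0.1],
    demand_forecast=[11.0] * 12,
)
```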

Feature store

SageMaker Feature Store in both its modes — see concepts/online-vs-offline-feature-store:

  • Offline — S3-backed, append mode; cold-storage use cases (batch pipelines, archiving, debugging); latency "in the order of minutes"; stores both daily datapoints and user-triggered feature-vector updates for long-term data retention.
  • Online — low-latency, low-throughput lookup of only the latest valid feature vector (either the daily-generated version or the most recent user-triggered update); 10–20 ms read/write per SKU; feeds both batch input generation and online serving.

The two-mode storage is explicit in the post:

"While offline feature store optimises for cost efficient high throughput data IO with latency in the order of minutes, online storage is optimised for low-latency, low throughput applications, providing lookup access to only the latest valid feature vectors."

Offline (batch) delivery

  1. Daily SageMaker Batch Transform runs the optimiser across all merchants + articles using the latest inventory setting from the offline feature store.
  2. A post-processing SageMaker Processing Job evaluates optimisation performance, enabling proactive monitoring of model performance and drift.
  3. Recommendations stored in S3; a "report generated" notification is published to the respective event stream for downstream consumers.

This is a proactive cache of daily batch predictions — partners always see fresh recommendations without on-demand recomputation when their inputs haven't changed.
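The three batch steps can be sketched as one orchestration function. `run_daily_batch`, the toy replenishment rule, and the publish hook are hypothetical stand-ins for the Batch Transform job, the Processing Job, and the event-stream notification:

```python
# Hypothetical sketch of the daily batch path; all names are illustrative.
def run_daily_batch(feature_vectors, optimise, publish):
    # 1. Run the optimiser across all partners x articles (Batch Transform).
    recommendations = {sku: optimise(vec) for sku, vec in feature_vectors.items()}
    # 2. Post-process (Processing Job): a toy health metric for drift monitoring.
    metrics = {
        "n_skus": len(recommendations),
        "n_distinct": len(set(recommendations.values())),
    }
    # 3. Store recommendations (S3 in the real system) and notify downstream.
    publish({"event": "report_generated", "metrics": metrics})
    return recommendations, metrics

events = []
recs, metrics = run_daily_batch(
    feature_vectors={"SKU-1": {"stock": 40}, "SKU-2": {"stock": 5}},
    optimise=lambda vec: max(0, 50 - vec["stock"]),  # toy order-up-to-50 rule
    publish=events.append,
)
```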

Online (interactive) delivery

When a partner changes inventory settings in the partner portal, the following async workflow fires:

  1. Enqueue on AWS SQS — each inventory-setting update becomes a queue message.
  2. AWS Lambda poller — Lambda drains the queue and serves each update request asynchronously.
  3. Feature fetch — for each inventory update, Lambda fetches the feature vector for the relevant SKUs from the online feature store.
  4. Multi-threaded optimisation — Lambda executes the optimisation algorithm with multi-threading parallelism (one thread per SKU or per independent sub-problem; exact parallelism model not disclosed).
  5. Store + notify — optimal predictions stored in S3; a notification is emitted to the backend event stream for the partner portal to surface fresh results.
  6. Side-effect: persist inventory-setting change to the offline feature store — so future offline batch predictions are consistent with the online what-if. This write is the load-bearing mechanism behind online+offline parity.
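The six steps above can be sketched as a single Lambda-style handler. It assumes the standard Lambda/SQS event shape (`Records` entries with a JSON `body`); the optimiser, stores, and notify hook are hypothetical stand-ins, and a thread pool stands in for the undisclosed parallelism model:

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Sketch of the SQS-driven Lambda worker; all hooks are illustrative.
def handler(event, online_store, offline_store, optimise, notify):
    results = {}
    for record in event["Records"]:                  # 1-2. dequeued updates
        update = json.loads(record["body"])
        sku_ids = update["sku_ids"]
        # 3. Fetch features from the online store, overlaid with new settings.
        vectors = {s: {**online_store[s], **update["settings"]} for s in sku_ids}
        # 4. Optimise each SKU in parallel (exact model not disclosed).
        with ThreadPoolExecutor(max_workers=8) as pool:
            results.update(zip(vectors, pool.map(optimise, vectors.values())))
        # 5. Store predictions (S3 in the real system) + notify the portal.
        notify({"event": "recommendations_ready", "skus": sku_ids})
        # 6. Persist the setting change so tomorrow's batch run agrees.
        for s in sku_ids:
            offline_store.append({"sku_id": s, "settings": update["settings"]})
    return results

online = {"SKU-1": {"stock": 40}}
offline_log, portal_events = [], []
event = {"Records": [{"body": json.dumps(
    {"sku_ids": ["SKU-1"], "settings": {"target": 60}})}]}
out = handler(event, online, offline_log,
              optimise=lambda v: max(0, v["target"] - v["stock"]),
              notify=portal_events.append)
```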

This is the canonical patterns/async-sqs-lambda-for-interactive-optimisation instance on the wiki.

Dual-mode delivery: the parity invariant

"It's important to note that the inventory optimisation algorithm and input features are synchronised between the two subsystems (online and offline), ensuring consistency across both engines."

The invariant is enforced by:

  • Single algorithm implementation — the same optimiser binary runs in both the Batch Transform job and the Lambda worker.
  • Shared feature-store namespace — both paths read the same feature vectors, and online writes back to the offline store.
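In miniature, the invariant amounts to both delivery paths calling one shared optimiser implementation against the same store — an illustrative sketch, not the production code:

```python
# Parity invariant in miniature: one optimiser implementation, one shared
# feature store, two call sites -- so the answers cannot diverge.
def optimise(vec):                     # single shared implementation
    return max(0, vec["target"] - vec["stock"])

def batch_path(store, sku):            # offline: Batch Transform call site
    return optimise(store[sku])

def online_path(store, sku):           # online: Lambda worker call site
    return optimise(store[sku])

store = {"SKU-1": {"stock": 30, "target": 50}}
```

Because neither path carries its own copy of the algorithm or its own feature snapshot, parity holds by construction rather than by reconciliation.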

Platform substrate

Canonical disclosure

Seen in
