
ZEOS Demand Forecaster

ZEOS Demand Forecaster is the weekly batch probabilistic-forecast pipeline inside the ZEOS Inventory Optimisation System. It produces a 12-week-ahead probabilistic forecast per (article_id, merchant_id, week) for 5 million SKUs (size + colour granularity) by training on 3 years of sliding-window history; the full pipeline runs end-to-end in under 2 hours.

Pipeline shape

Three stages (right-to-left per the post's Figure 2):

1. Feature Engineering

Split into two complementary tiers: data pre-processing and data transformation.

Data pre-processing layer — model upstream data into a human-understandable time-series representation, enabling easier validation and analysis.

  • Tools: PySpark + Spark-SQL on Databricks transient job clusters writing to Delta Lake.
  • Operations: joins, filters, aggregations.
  • Window: 2.5-year timeframe — "enough seasonal patterns without overemphasising older historical performance."
  • Scales horizontally; the volume grows linearly in SKUs × history.

Data transformation layer — engineer features that maximise predictive signals for model training.

  • Tools: Pandas, scikit-learn, NumPy, Numba inside a SageMaker Processing Job.
  • Operations: encoding, normalisation, etc.
  • Scales vertically because scikit-learn / NumPy / Numba lack native distribution support.

Key transformations:

  • Deriving historical demand from sales + stock / availability data.
  • Pricing information: initial and discounted prices at weekly granularity.
  • Article metadata (category, colour, material, …).
  • Unique identifier per time-series: (article_id, merchant_id) tuple — see concepts/skus-as-time-series-unit.
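
One common way to derive demand from sales plus stock/availability data, sketched here as an assumption rather than the post's actual rule: treat sales in weeks with stock as observed demand, and mark stock-out weeks as censored so downstream handling can impute them. Column names (`units_sold`, `in_stock_share`) are hypothetical.

```python
# Hedged sketch: observed sales equal demand only when the article was
# actually available; zero-availability weeks are censored, not zero demand.
import numpy as np
import pandas as pd

weekly = pd.DataFrame({
    "article_id": ["a1"] * 3,
    "merchant_id": ["m1"] * 3,
    "week": pd.to_datetime(["2024-01-01", "2024-01-08", "2024-01-15"]),
    "units_sold": [4.0, 0.0, 6.0],
    "in_stock_share": [1.0, 0.0, 0.8],  # share of the week the article was on offer
})

weekly["demand"] = np.where(
    weekly["in_stock_share"] > 0, weekly["units_sold"], np.nan
)
```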

Forecasting-specific features (target lags / transformations, exogenous feature lags / transformations, temporal features) are not implemented in-house — handed off to Nixtla's MLForecast, which uses Numba under the hood.

2. Model Training + Predictions

Rationale from the post:

"After extensive experimentation with deep learning models like TFT and other machine learning approaches, we selected the LightGBM model integrated with Nixtla's MLForecast interface as the foundation of our demand forecasting pipeline."

And on the train+infer collapse:

"Due to the ML model's lightweight training footprint, we bypass complexity, like for example not needing checkpointing, or separate infrastructure for inference. Instead, model training as well as model inference are executed in a single pipeline using AWS SageMaker Training Jobs."

3. Post Processing

Scale (verbatim)

"Our weekly forecasting pipeline processes 3 years of historical data for 5 million SKUs (size and colour) using a sliding window approach, and takes less than 2 hours. This high performance pipeline is enabled by a deliberate focus on data model design and I/O efficiency. We maintain a low total cost of ownership while ensuring reliability and scalability guarantees by leveraging zFlow and AWS-native services in our pipeline."

Quantity                 | Value
SKUs                     | 5,000,000 (at size + colour granularity)
Historical input         | 3 years, sliding window (2.5-year effective for seasonal capture)
Forecast horizon         | 12 weeks ahead
Forecast cadence         | Weekly
End-to-end wall-clock    | < 2 hours
Forecast output per unit | Probabilistic distribution (not point estimate)
Keyed by                 | (article_id, merchant_id, week)

Platform substrate

Runs on zFlow; zFlow compiles the pipeline to an AWS Step Functions state machine via AWS CDK-generated CloudFormation. Databricks clusters + SageMaker jobs are launched per-run as dedicated resources so a failure of one run doesn't impact parallel executions — see patterns/transient-databricks-cluster-per-run.
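
A hedged sketch of what the compiled state machine might look like in Amazon States Language. The state names, the Lambda-based Databricks cluster launch, and the parameter stubs are assumptions; the actual zFlow output is not disclosed. (SageMaker `.sync` task integrations are real Step Functions features.)

```json
{
  "Comment": "Illustrative ZEOS forecaster state machine (not the real zFlow output)",
  "StartAt": "PreProcessDatabricks",
  "States": {
    "PreProcessDatabricks": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": { "FunctionName": "launch-transient-databricks-cluster" },
      "Next": "TransformProcessingJob"
    },
    "TransformProcessingJob": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sagemaker:createProcessingJob.sync",
      "Next": "TrainAndPredict"
    },
    "TrainAndPredict": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
      "End": true
    }
  }
}
```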
