CONCEPT Cited by 1 source
Online vs offline feature store¶
Definition¶
An online feature store and an offline feature store are the two complementary faces of the same feature repository, tuned for different access patterns:
| Dimension | Online store | Offline store |
|---|---|---|
| Latency target | Single-digit to low-tens of ms per lookup | Seconds to minutes |
| Throughput target | Low data volume per lookup, served at high request rates (RPS) | High-throughput bulk scans |
| Backing storage | Low-latency KV (DynamoDB-class, ElastiCache-class) | Object storage (S3-class) |
| Data retained | Only the latest valid feature vector per entity | Full historical log (append mode) |
| Primary use | Real-time inference; user-facing interactive endpoints | Batch pipelines; training; archiving; debugging |
| Cost profile | Higher $/byte; optimised for read-hot access | Lower $/byte; optimised for bulk scans |
| Write model | Point-in-time overwrite (or versioned latest) | Append-only log |
Why both are needed¶
- Online consumers (inference endpoints, interactive what-if UI) need sub-50ms lookups of one or a few feature vectors per request.
- Offline consumers (batch inference, training, performance evaluation, debugging) need to scan millions of rows efficiently and preserve history.
Forcing one store to serve both roles causes architectural pain in both directions: keeping full history in the online store means paying hot-storage prices for cold data, and serving interactive traffic from the offline store pushes inference latency from milliseconds to seconds or minutes.
Consistency contract¶
If the online and offline stores are fed from different pipelines or have different update semantics, consumers can see drift between them — "the batch predictor saw feature vector v1 yesterday, but the real-time predictor sees v2 now, and both claim to be authoritative."
Zalando's canonical discipline: patterns/online-plus-offline-feature-store-parity — every online write also persists to the offline store, and the same algorithm runs against both paths, so "what-if" interactive results and daily batch results always agree.
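The parity discipline can be illustrated with a minimal in-memory sketch. This is not Zalando's implementation: the `ParityFeatureStore` class, its method names, and the integer `event_time` are hypothetical, and a dict and a list stand in for the KV store and the object store. It only shows the invariant that every write reaches both stores, so replaying the offline log reproduces what the online store serves.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ParityFeatureStore:
    """Hypothetical dual-write store: online keeps the latest vector, offline keeps the log."""

    # entity_id -> {"event_time": int, "features": dict}; overwritten on each newer write
    online: dict[str, dict[str, Any]] = field(default_factory=dict)
    # append-only list of (entity_id, event_time, features)
    offline: list[tuple[str, int, dict[str, Any]]] = field(default_factory=list)

    def put(self, entity_id: str, event_time: int, features: dict[str, Any]) -> None:
        # Every write is appended to the offline log first ...
        self.offline.append((entity_id, event_time, dict(features)))
        # ... and replaces the online vector only if it is at least as recent.
        current = self.online.get(entity_id)
        if current is None or event_time >= current["event_time"]:
            self.online[entity_id] = {"event_time": event_time, "features": dict(features)}

    def get_latest(self, entity_id: str) -> dict[str, Any]:
        """Point lookup, as an online consumer would issue it."""
        return self.online[entity_id]["features"]

    def latest_from_offline(self) -> dict[str, dict[str, Any]]:
        """Bulk scan of the log, reduced to the latest vector per entity."""
        latest: dict[str, tuple[int, dict[str, Any]]] = {}
        for entity_id, event_time, features in self.offline:
            if entity_id not in latest or event_time >= latest[entity_id][0]:
                latest[entity_id] = (event_time, features)
        return {entity_id: pair[1] for entity_id, pair in latest.items()}

    def in_parity(self) -> bool:
        """True when the online view equals the latest state derived from the offline log."""
        online_view = {k: v["features"] for k, v in self.online.items()}
        return online_view == self.latest_from_offline()


store = ParityFeatureStore()
store.put("sku-1", event_time=1, features={"stock": 10})   # daily datapoint
store.put("sku-1", event_time=2, features={"stock": 7})    # user-triggered update
store.put("sku-2", event_time=1, features={"stock": 3})
```

Because both views derive from the same sequence of writes, a batch job that scans the log and an interactive request that reads the online store see the same latest vector for each entity.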
Canonical instance (Zalando ZEOS)¶
Zalando ZEOS's inventory optimiser uses SageMaker Feature Store in both modes:
- Online: 10–20 ms read/write per SKU; serves the partner-portal interactive path (user changes an inventory setting → SQS → Lambda → fetch feature vector → run optimiser).
- Offline: S3-backed, append mode; "latency in the order of minutes"; stores daily datapoints and user-triggered feature-vector updates; feeds the daily SageMaker Batch Transform job and long-term archival / debug workflows.
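As a rough sketch of the interactive path in the online bullet above, the handler below reads a feature vector, applies a user change, writes it back, and runs an optimiser. The source does not show ZEOS code, so everything named here is an assumption: the feature group name, the feature names `sku` and `event_time`, the message payload shape, and the `optimise` callable are hypothetical. The `client` argument is expected to behave like the boto3 `sagemaker-featurestore-runtime` client, whose `get_record` and `put_record` calls are used with their documented keyword arguments.

```python
import json
from typing import Any, Callable

FEATURE_GROUP = "inventory-features"   # hypothetical feature group name
RECORD_ID_FEATURE = "sku"              # hypothetical record identifier feature
EVENT_TIME_FEATURE = "event_time"      # hypothetical event time feature


def fetch_feature_vector(client: Any, sku: str) -> dict[str, str]:
    """Read the latest record for one SKU from the online store."""
    response = client.get_record(
        FeatureGroupName=FEATURE_GROUP,
        RecordIdentifierValueAsString=sku,
    )
    # Use .get so a missing record yields an empty vector instead of a KeyError.
    return {f["FeatureName"]: f["ValueAsString"] for f in response.get("Record", [])}


def write_feature_vector(client: Any, features: dict[str, str]) -> None:
    """Write one record; which stores receive it depends on the feature group setup."""
    client.put_record(
        FeatureGroupName=FEATURE_GROUP,
        Record=[{"FeatureName": k, "ValueAsString": str(v)} for k, v in features.items()],
    )


def handle_sqs_event(
    event: dict[str, Any],
    client: Any,
    optimise: Callable[[dict[str, str]], Any],
    now: str,
) -> list[Any]:
    """Process SQS-delivered setting changes: fetch, update, persist, optimise."""
    results = []
    for message in event["Records"]:
        # Assumed payload: {"sku": "...", "overrides": {feature_name: value}}
        change = json.loads(message["body"])
        vector = fetch_feature_vector(client, change["sku"])
        vector.update({name: str(value) for name, value in change["overrides"].items()})
        vector[RECORD_ID_FEATURE] = change["sku"]
        vector[EVENT_TIME_FEATURE] = now
        write_feature_vector(client, vector)
        results.append(optimise(vector))
    return results
```

In a deployed Lambda the client would typically be created once per container with `boto3.client("sagemaker-featurestore-runtime")`; it is passed in here so the flow can be exercised with a stub.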
Verbatim:
"While offline feature store optimises for cost efficient high throughput data IO with latency in the order of minutes, online storage is optimised for low-latency, low throughput applications, providing lookup access to only the latest valid feature vectors — either daily generated vectors or the most recent user-triggered updates."
Third substrate — in-process sketching feature store¶
Zalando's 2021-10 benchmark establishes a third substrate that sits outside the online/offline pair: a sketching feature store backed by a Bloom filter that lives in the serving process's RAM. Where online and offline feature stores are both externally hosted (a KV cluster + an object store), the sketching substrate eliminates the network hop entirely: composite keys f"{user}^{article}" go into a Bloom filter, and the ranker reconstructs a user's interaction set by probing the filter for every known article ID.
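The probing scheme described above can be sketched with a small hand-rolled Bloom filter. This is an illustration under assumptions, not the benchmarked implementation: the `BloomFilter` class, its sizing from a target error rate, and the SHA-256 double-hashing scheme are choices made here for a self-contained example; only the composite key format comes from the text.

```python
import hashlib
import math
from typing import Iterable


class BloomFilter:
    """Minimal Bloom filter: supports add and membership test, never deletion."""

    def __init__(self, capacity: int, error_rate: float = 0.01) -> None:
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hash functions.
        self.num_bits = max(8, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key: str) -> list[int]:
        # Double hashing: derive k bit positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def composite_key(user: str, article: str) -> str:
    """Key format from the text: user and article joined by '^'."""
    return f"{user}^{article}"


def interaction_set(sketch: BloomFilter, user: str, article_ids: Iterable[str]) -> set[str]:
    """Reconstruct a user's interactions by probing every known article ID."""
    return {a for a in article_ids if composite_key(user, a) in sketch}


sketch = BloomFilter(capacity=1000, error_rate=0.001)
for article in ["a1", "a3"]:
    sketch.add(composite_key("u42", article))

# May contain rare false positives, but never misses a stored interaction.
recovered = interaction_set(sketch, "u42", ["a1", "a2", "a3", "a4"])
```

The reconstruction is only possible because the article catalogue can be enumerated at probe time, and the result is a superset of the true interactions: a Bloom filter can report false positives but not false negatives.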
The benchmark shows ~30× memory reduction at near-lossless AUC (470 MB sketch vs 15 GB conventional KV feature store; click-prediction AUC 0.7997 vs 0.80) and zero network-hop latency (the conventional store costs 2–10 ms + tail per lookup).
The substrate is scoped — it only wins when:
- Freshness can be batch-cadence (see concepts/feature-store-freshness). In-process state cannot shard writes without rebuilding a distributed database; each serving node must absorb 100 % of write traffic for incremental updates.
- Feature vocabulary is bounded and enumerable — unbounded spaces like bag-of-words on reviews cannot be enumerated at probe time.
- Individual deletion is not required — sketches cannot delete by construction; expiry is full rebuild.
When freshness is load-bearing (real-time personalisation, engagement features updated on the second), the online/offline pair remains the right answer. When freshness tolerates hours and the workload has many-lookups-per-request fan-out (ranking 1,000 candidates, composite-key interaction features), the sketching substrate dominates on both memory and latency. See patterns/probabilistic-feature-store-over-kv for the named substitution. (Source: .)
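The scoping conditions above can be condensed into a small decision helper. It is a sketch of the reasoning only: the `Workload` fields, the `choose_substrate` function, and the one-hour threshold standing in for "freshness tolerates hours" are hypothetical, and request fan-out is deliberately not modelled.

```python
from dataclasses import dataclass

# Hypothetical cut-off for "batch cadence"; the text only says "hours".
BATCH_CADENCE_SECONDS = 3600.0


@dataclass(frozen=True)
class Workload:
    freshness_tolerance_seconds: float   # how stale a served feature may be
    vocabulary_enumerable: bool          # can every candidate key be listed at probe time?
    needs_individual_deletes: bool       # must single entries be removable?


def choose_substrate(workload: Workload) -> str:
    """Return 'sketching' only when all three scoping conditions hold."""
    sketch_is_viable = (
        workload.freshness_tolerance_seconds >= BATCH_CADENCE_SECONDS
        and workload.vocabulary_enumerable
        and not workload.needs_individual_deletes
    )
    return "sketching" if sketch_is_viable else "online+offline"


ranking = Workload(freshness_tolerance_seconds=6 * 3600, vocabulary_enumerable=True,
                   needs_individual_deletes=False)
realtime = Workload(freshness_tolerance_seconds=1, vocabulary_enumerable=True,
                    needs_individual_deletes=False)
```

Here `choose_substrate(ranking)` selects the sketching substrate, while `choose_substrate(realtime)` falls back to the online/offline pair because freshness is load-bearing.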
Seen in¶
- sources/2025-06-29-zalando-building-a-dynamic-inventory-optimisation-system-a-deep-dive: canonical first wiki disclosure of the two-substrate (online + offline) architectural shape.
-
— introduces the third substrate (in-process sketching feature store) and canonicalises freshness as the axis that scopes substrate selection.
Related¶
- systems/aws-sagemaker-feature-store — named vendor product for this shape.
- patterns/online-plus-offline-feature-store-parity — discipline to keep the two stores in sync.
- concepts/proactive-cache-of-batch-predictions — sibling pattern for proactively keeping offline outputs fresh.
- concepts/sketching-feature-store — the in-process third substrate.
- concepts/feature-store-freshness — the axis that scopes substrate choice.
- patterns/probabilistic-feature-store-over-kv — the substitution pattern for the sketching substrate.
- systems/zeos-replenishment-recommender
- companies/zalando