
PATTERN

Dual-Write to Online and Offline Store

Definition

Dual-write to online and offline store is the ingestion-time discipline of a feature-store- or embedding-store-shaped platform writing every new record to both storage tiers simultaneously: the low-latency online store (a KV or vector DB for interactive serving) and the durable offline store (the historical repository for analytics, training, backfill, and recovery). The discipline holds regardless of ingestion mode.

(Source: sources/2026-01-06-expedia-powering-vector-embedding-capabilities)

Why both, every time

Three operational properties depend on it:

  1. Historical completeness is structural, not best-effort. The offline store is supposed to contain the entire history of the corpus (every embedding / feature value ever produced, with timestamps + metadata). That completeness relies on dual writes at ingest time, not on a batch-copy job after the fact.
  2. The offline store is the restore source. If the online store is corrupted, lost, or has to be rebuilt with a new index / new sharding / new vector-DB engine, the offline store is the authoritative replay log. For that to work, every record in the online store has to also have been written to the offline store.
  3. Training / analytics use the same values as serving. A model trained on offline-store data queries the online store at serving time. Dual writes eliminate the class of "training saw 1.23, but serving saw 1.24 because of a pipeline skew" bugs before they can occur.

Expedia's framing

"Regardless of the method chosen to load data, the service ensures that all embeddings are stored simultaneously in both the online and offline storage systems, providing robust access for various use cases."

The "regardless of the method chosen" is load-bearing — the three ingestion modes (batch materialization, Insert API, on-the-fly generation) all converge onto one dual-write step. See patterns/embedding-ingestion-modes.

A direct corollary named in the same post:

"The seamless integration between the online and offline stores allows users to restore data from the offline store to the online store whenever needed. This can be done based on various scenarios such as embeddings' creation dates, specific time ranges, or more complex SQL queries."

(Source: sources/2026-01-06-expedia-powering-vector-embedding-capabilities)

Implementation shapes

The pattern doesn't mandate a consistency model — several implementation shapes exist, each with different failure modes:

Shape A — synchronous 2-phase dual write

Ingest request writes to offline store + online store in one atomic step; either both succeed or both fail.

  • Pros. Strong consistency; no reconciliation needed.
  • Cons. Tightest coupling; online-store availability now gates offline write. Real 2PC is rare in practice.
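Since real 2PC is rare, Shape A is usually approximated with a compensating action. A hedged sketch under that assumption (names are illustrative):

```python
class OfflineLog:
    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)

    def delete(self, record):
        self.records.remove(record)


def dual_write_atomic(record, offline, online_put):
    """Write offline, then online; undo the offline write if online fails.

    A compensating delete approximates "both succeed or both fail"
    without a true two-phase commit.
    """
    offline.append(record)
    try:
        online_put(record)
    except Exception:
        offline.delete(record)  # roll back to "neither"
        raise
```

Note the coupling the bullet describes: the offline record only survives if the online write succeeds, so online-store availability gates every ingest.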

Shape B — write-offline-first, materialize-to-online

Write to offline store synchronously; enqueue a materialization to online store; mark the record "visible online" only when materialization lands.

  • Pros. Offline store is the authoritative write destination; online is a derived view. Failures in online propagation are recoverable by re-materialization.
  • Cons. Added latency to visibility on online path.
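A sketch of Shape B, assuming an in-process queue as a stand-in for whatever materialization mechanism a real platform would use:

```python
from collections import deque


class OfflineFirstIngest:
    def __init__(self):
        self.offline = []            # durable, authoritative log
        self.online = {}             # derived serving view
        self.pending = deque()       # materialization queue
        self.visible_online = set()  # visibility flag per record

    def ingest(self, record):
        self.offline.append(record)  # synchronous, authoritative write
        self.pending.append(record)  # enqueue async propagation

    def drain(self):
        # A failed materialization can simply be retried: the offline
        # record is already durable, so online is always recoverable.
        while self.pending:
            record = self.pending.popleft()
            self.online[record["id"]] = record
            self.visible_online.add(record["id"])
```

The visibility set is what makes the online store an honest derived view: readers never see a record that has not actually landed online.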

Shape C — parallel fan-out with reconciliation

Both writes fire in parallel from the ingestion front-end; asynchronous reconciliation catches divergences.

  • Pros. Low latency; common in feature-store implementations.
  • Cons. Eventual-consistency window; reconciliation machinery is an operational tax.
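A sketch of Shape C with a thread pool for the fan-out and a reconciliation pass that treats the offline log as ground truth; the store shapes are illustrative, not a production design:

```python
from concurrent.futures import ThreadPoolExecutor

offline_log = []   # durable history
online_kv = {}     # serving view


def fan_out(record):
    """Fire both writes in parallel; neither gates the other."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(offline_log.append, record),
            pool.submit(online_kv.__setitem__, record["id"], record),
        ]
        for f in futures:
            f.result()  # surface failures without ordering the writes


def reconcile():
    """Close the eventual-consistency window: re-materialize anything
    the offline log has that the online store is missing."""
    missing = [r for r in offline_log if r["id"] not in online_kv]
    for r in missing:
        online_kv[r["id"]] = r
    return len(missing)
```

The reconciliation job is the "operational tax" the bullet names: it has to run on a schedule, and its lag defines the divergence window.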

The Expedia post does not disclose which shape it uses, and Dropbox's Dash feature store is similarly silent. In practice, production platforms tend toward Shape B or C with explicit reconciliation; Shape A is rare at scale.

Restore path as the canonical beneficiary

The reason dual writes pay for themselves operationally is the restore path:

  • Scenario: new index / algorithm. Provision a new online store (or a new vector-DB collection), replay the offline store's history, cut over. No model call; no ingestion-pipeline reroute.
  • Scenario: online-store data loss. Lost region, lost cluster, corruption — offline store is the recovery substrate.
  • Scenario: rollback to a model version. Replay the window of embeddings produced by the prior model from offline into a new online collection.
  • Scenario: experimentation. Spin up an isolated online store seeded from a slice of offline data (creation-date range, SQL predicate) for A/B evaluation without contaminating production.

Every one of these flows reads the offline store as the ground truth and materializes it into a (new or existing) online store — the same primitive the offline → online restore path provides.
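All four scenarios reduce to the same primitive, which can be sketched as a predicate-filtered replay; the record fields here (creation date, model version) mirror the selectors the post mentions, but the function itself is an illustrative assumption:

```python
def restore(offline_log, new_online, predicate):
    """Materialize the subset of offline history matching `predicate`
    into a (new or existing) online store. Offline is ground truth."""
    restored = 0
    for record in offline_log:
        if predicate(record):
            new_online[record["id"]] = record
            restored += 1
    return restored


# e.g. rebuild an online collection with only a prior model's embeddings:
#   restore(log, {}, lambda r: r["model_version"] == "v1")
# or replay a creation-date window:
#   restore(log, {}, lambda r: "2026-01-01" <= r["created_at"] < "2026-02-01")
```

New-index cutover, disaster recovery, model rollback, and experiment seeding differ only in the predicate and the target collection.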

Alternative: single-tier with CDC out

An alternative to dual-writing is writing only to the online store and streaming changes out via change-data-capture (CDC) to an analytical warehouse. This is a legitimate design, but it makes the online store load-bearing for history, which is a problem if the online store is a purpose-built vector DB or KV store that cannot serve historical scans cheaply. Dual writes decouple the online store's job (serve current data fast) from the offline store's job (hold history cheaply), at the cost of two writes per ingest.

When to apply

Apply when:

  • You operate a feature store or embedding platform with distinct online and offline tiers.
  • Backfill / replay from offline → online is a supported operation (not a once-a-year runbook).
  • Multiple model versions, indices, or schemas need to coexist and be experimented with.

Don't apply when:

  • You have only one tier — pick the right storage and skip the duplication.
  • The offline store is a CDC-derived copy that is truly equivalent to the online store (in which case "dual write" is conceptually the same as "CDC" and the distinction is academic).
