SYSTEM

Expedia Embedding Store Service

Definition

Expedia Embedding Store Service is Expedia Group ML Platform team's centralized embedding platform: a vector-database-backed service exposing standardized APIs for creating embedding collections, loading vector embeddings, and running similarity / hybrid searches across multiple Expedia ML workloads (recommendation, semantic search, and similar).

(Source: sources/2026-01-06-expedia-powering-vector-embedding-capabilities)

Structural shape

          ┌──────────────────────────────────────┐
          │     Clients (Expedia ML services)    │
          └──────────┬───────────────────────────┘
                     │  create-collection / insert /
                     │  on-the-fly embed / search
   ┌─────────────────────────────────────────────┐
   │        Embedding Store Service (API)        │
   │  - similarity search (top-K, index-backed)  │
   │  - hybrid search (vector + metadata filter) │
   │  - on-the-fly embedding gen (calls models)  │
   └──┬───────────────────────────────────┬──────┘
      │                                   │
      ▼                                   ▼
  ┌─────────┐                    ┌───────────────────────────┐
  │  Feast  │ ── registers ──→   │ online store (vector DB,  │
  │(metadata│    collections +   │ interactive similarity)   │
  │ layer)  │    service + model └───────────────────────────┘
  │         │    + version                 ▲   simultaneous
  │         │                              │   write on ingest
  └─────────┘                    ┌─────┴─────────────────────┐
                                 │ offline store (historical │
                                 │ dataset repository)       │
                                 └───────────────────────────┘

Ingestion modes (see patterns/embedding-ingestion-modes):

  1. Batch — Feast materialization over Spark, pulling from one or more offline sources.
  2. Insert API — real-time or small-batch writes.
  3. On-the-fly embedding generation — the service itself invokes named models to produce embeddings.

All three land in both online and offline stores simultaneously (patterns/dual-write-online-offline).
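The three ingestion modes and the dual-write can be sketched as a toy client. The post does not publish the real Embedding Store API, so every class, method, and collection name below is illustrative, and the "model call" is a stub:

```python
# Hypothetical client sketching the three ingestion modes; the real
# Embedding Store client and method names are not disclosed in the post.

class EmbeddingStoreClient:
    """Illustrative client: batch, insert API, and on-the-fly generation."""

    def __init__(self) -> None:
        self.online: dict[str, list[float]] = {}   # vector DB stand-in
        self.offline: dict[str, list[float]] = {}  # historical store stand-in

    def batch_materialize(self, collection: str, rows: dict[str, list[float]]) -> int:
        # 1. Batch: Feast materialization over Spark, pulling from offline sources.
        return self.insert(collection, rows)

    def insert(self, collection: str, rows: dict[str, list[float]]) -> int:
        # 2. Insert API: real-time or small-batch writes of precomputed vectors.
        # Dual-write: every ingest lands in BOTH the online and offline stores.
        self.online.update(rows)
        self.offline.update(rows)
        return len(rows)

    def insert_raw(self, collection: str, model: str, texts: list[str]) -> int:
        # 3. On-the-fly: the service itself invokes a named model to embed
        # raw payloads before writing. The zero vectors here are a model stub.
        vectors = {f"doc-{i}": [0.0, 0.0] for i, _ in enumerate(texts)}
        return self.insert(collection, vectors)


client = EmbeddingStoreClient()
client.insert_raw("hotel-descriptions", model="embed-v2", texts=["sea view", "city loft"])
assert client.online.keys() == client.offline.keys()  # dual-write invariant
```

The invariant at the end is the point: regardless of which mode produced a vector, both tiers see the same write.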

Feast as the metadata layer

The Embedding Store uses systems/feast — typically sold as a feature store — as the metadata + discoverability layer for embedding collections. This is a non-obvious repurposing: Feast's declarative definitions + adapter ecosystem apply cleanly to embedding collections and not only to feature views. Each collection carries:

  • Associated service — the system / application that generates and/or consumes the embeddings.
  • Model + version — which embedding model (and version) produced the vectors.
  • Schema / index / algorithm metadata — underpins version management across models or index algorithms.

Named benefits of attaching this metadata:

  1. Data consistency — every embedding in a collection is pinned to the model and service it was generated for; no cross-version contamination.
  2. Search and discoverability — ML teams can locate existing collections by model / version / consuming service instead of re-embedding the same corpus.
  3. Version management — multiple collections for the "same" dataset can coexist with different embedding models, index algorithms, or schemas, preserving lineage while supporting experimentation.

(Source: sources/2026-01-06-expedia-powering-vector-embedding-capabilities)
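The per-collection metadata and the three benefits can be sketched as a tiny registry. The post does not show Expedia's actual Feast definitions, so the dataclass, field names, and registry keying below are assumptions about what "service + model + version" registration could look like:

```python
# Illustrative registry of collection metadata; the real Feast objects and
# field names Expedia uses are not shown in the post.

from dataclasses import dataclass

@dataclass(frozen=True)
class CollectionRegistration:
    name: str             # embedding collection name
    service: str          # system that generates and/or consumes the embeddings
    model: str            # embedding model that produced the vectors
    model_version: str    # pins every vector to one model version
    index_algorithm: str  # schema / index metadata for version management

registry: dict[tuple[str, str], CollectionRegistration] = {}

def register(c: CollectionRegistration) -> None:
    # Keyed by (name, model_version): collections for the "same" dataset can
    # coexist across model versions with no cross-version contamination.
    registry[(c.name, c.model_version)] = c

register(CollectionRegistration("hotel-desc", "search-ranking", "embed", "v1", "hnsw"))
register(CollectionRegistration("hotel-desc", "search-ranking", "embed", "v2", "hnsw"))

# Discoverability: locate existing collections by consuming service
# instead of re-embedding the same corpus.
hits = [c for c in registry.values() if c.service == "search-ranking"]
assert len(hits) == 2
```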

Online / offline duality

Inherited from Feast's online/offline split — applied here to vectors, not numerical features:

  • Online store (vector DB): interactive, low-latency similarity search over current data. Workloads: recommendation systems, semantic search.
  • Offline store: complete historical record of embeddings + metadata. Workloads: analytical queries, experimentation, model training, backup.

The post calls out an explicit restore path from offline → online: selectively repopulate the online store from the offline store based on embedding creation dates, specific time ranges, or arbitrary SQL queries. This makes the offline store a first-class recovery and backfill substrate, not a passive archive.

Both tiers receive every write (see patterns/dual-write-online-offline), so the offline store's historical completeness is structural, not best-effort.
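The restore path reduces to a filtered scan of the offline store. The post names the selection criteria (creation dates, time ranges, arbitrary SQL); the storage engines and schema below are stand-ins:

```python
# Sketch of the offline → online restore path: selectively repopulate the
# online store from the offline store. Engines and schema are illustrative.

from datetime import date

OFFLINE = [  # stand-in for the historical offline store
    {"id": "a", "vector": [0.1, 0.2], "created": date(2025, 11, 1)},
    {"id": "b", "vector": [0.3, 0.4], "created": date(2025, 12, 15)},
    {"id": "c", "vector": [0.5, 0.6], "created": date(2026, 1, 2)},
]

def restore_online(since: date) -> dict[str, list[float]]:
    # Equivalent to: SELECT id, vector FROM offline WHERE created >= :since
    # (the post also allows arbitrary SQL predicates, not just date ranges).
    return {row["id"]: row["vector"] for row in OFFLINE if row["created"] >= since}

online = restore_online(since=date(2025, 12, 1))
assert sorted(online) == ["b", "c"]
```

Because the offline store receives every write, a restore bounded only by an open date range rebuilds the online store exactly.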

Search surfaces

Two query shapes are exposed:

  • Similarity search — standard top-K nearest-neighbor over the online store's index. The post notes the canonical index trade-off — "the choice of index type depends on factors such as dataset size and the balance between speed and accuracy required" — without naming the structure (HNSW / IVF / DiskANN / etc.).
  • Hybrid search — vector similarity combined with attribute / metadata filters (e.g. price < 100, category = electronics). "Hybrid search makes the queries smarter and more precise by combining the power of vector searches with traditional filtering." This is not the same "hybrid" as [hybrid retrieval of BM25 + dense vectors](<../concepts/hybrid-retrieval-bm25-vectors.md>) — it's vector-plus-structured-filter, not lexical-plus-dense-fusion.

What the post does not disclose

  • Vector DB engine. Online-store implementation is unnamed (Milvus / Qdrant / Weaviate / Pinecone / OpenSearch k-NN / in-house — not stated).
  • Offline store engine. Unnamed (S3 + Parquet / Iceberg / Delta / warehouse — not stated).
  • Embedding models. No model names, no dimensionality, no distance metrics.
  • Scale. No vector counts, QPS, p50 / p99 latency, or cost.
  • Consumer workloads. Expedia-side product / service names that use the Embedding Store are not disclosed.
  • Consistency model of the dual-write. Whether synchronous, asynchronous best-effort, or a pipelined online-after-offline arrangement is not specified.
  • Pre-filter vs post-filter semantics for hybrid search.

(Source: sources/2026-01-06-expedia-powering-vector-embedding-capabilities)

Comparisons

  • vs Dropbox Dash feature store — Dash is a feature store (numerical/categorical ranking features) on the same Feast substrate but a different domain; shares the online/offline duality, shares Feast-as-orchestration, diverges on payload (feature vectors vs embedding vectors) and on the named dual-write + on-the-fly-generation shape. Dash replaced Feast's Python serving with Go for GIL-avoidance — the Expedia post does not comment on serving implementation.
  • vs Amazon S3 Vectors — S3 Vectors is a managed vector-storage tier (cold, archival, hybrid-tiered with OpenSearch for hot). Expedia's Embedding Store is an application-layer service sitting above some vector DB and a historical store; it owns collection metadata, ingestion modes, and on-the-fly generation. A team could in principle run Expedia's Embedding Store on top of an S3-Vectors-plus-OpenSearch hybrid tier, but the post does not suggest that's what Expedia does.
  • vs Bedrock Knowledge Bases — Bedrock KB packages an embedding-model choice + chunking + a vector store as a managed RAG-focused product. Expedia's service exposes a lower-level platform API (collections, similarity / hybrid search) and is multi-workload, not RAG-specific.
