
CONCEPT Cited by 1 source

Hybrid Search (vector similarity + metadata filter)

Definition

Hybrid search (in the vector-DB sense) is the retrieval primitive that combines a vector similarity search with a structured filter over the vectors' metadata fields, such that results are both semantically similar to the query vector and satisfy attribute predicates (e.g. price < 100, category = electronics, region = "EU", date > 2025-01-01).

(Source: sources/2026-01-06-expedia-powering-vector-embedding-capabilities)
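
As a minimal sketch of the definition above, the following brute-force version keeps only items whose metadata satisfies a predicate and ranks the survivors by cosine distance. The item layout, the `hybrid_search` name, and the toy catalogue are assumptions made for illustration; they are not the Embedding Store Service's API, and a real store would use an ANN index rather than a full scan.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def hybrid_search(items, query_vector, predicate, k):
    """Keep items whose metadata passes the predicate, then rank them by distance."""
    matches = [item for item in items if predicate(item["metadata"])]
    matches.sort(key=lambda item: cosine_distance(item["vector"], query_vector))
    return matches[:k]

# Hypothetical toy catalogue; ids, vectors and fields are illustrative only.
ITEMS = [
    {"id": "a", "vector": [1.0, 0.0], "metadata": {"price": 50, "category": "electronics"}},
    {"id": "b", "vector": [0.9, 0.1], "metadata": {"price": 250, "category": "electronics"}},
    {"id": "c", "vector": [0.0, 1.0], "metadata": {"price": 20, "category": "electronics"}},
    {"id": "d", "vector": [0.8, 0.2], "metadata": {"price": 30, "category": "books"}},
]

def cheap_electronics(meta):
    """Predicate equivalent to: price < 100 AND category = electronics."""
    return meta["price"] < 100 and meta["category"] == "electronics"

results = hybrid_search(ITEMS, [1.0, 0.0], cheap_electronics, k=2)
result_ids = [item["id"] for item in results]
```

Items "b" and "d" are closer to the query than "c", but they fail the price and category conditions respectively, so they never appear in the result.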

Expedia's framing

"In addition to similarity search, the Embedding Store Service also supports hybrid search, which combines vector similarity search with filtering based on additional fields in the data. This enables queries that not only find similar vectors but also apply conditions, such as 'price < 100' or 'category = electronics', to refine the results. Hybrid search makes the queries smarter and more precise by combining the power of vector searches with traditional filtering."

Terminology warning — two different "hybrid searches"

This is a namespace collision; both usages are standard.

| Usage | Meaning | Canonical wiki page |
| --- | --- | --- |
| Hybrid search (this page) | vector similarity + structured metadata filter | concepts/hybrid-search |
| Hybrid retrieval | lexical (BM25) + dense vector fusion for keyword-aware + semantic recall | concepts/hybrid-retrieval-bm25-vectors |

They are orthogonal, not synonyms:

  • This page's hybrid combines one retrieval mechanism (vector ANN) with one filter mechanism (attribute predicate). Output is still a top-K ranked by distance.
  • Hybrid retrieval fuses two retrieval mechanisms (BM25 scores + vector distances) into a combined score, usually via reciprocal rank fusion or a learned ranker.

A production system can (and often does) run both — BM25 + dense-vector fusion plus structured attribute filtering layered on the combined candidate set.
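
To make the distinction concrete, here is a small sketch that runs both: reciprocal rank fusion over two ranked lists, followed by an attribute filter on the fused candidates. The two rankings and the `METADATA` table are hypothetical inputs, and `k=60` is only a commonly used RRF constant, not a value taken from any source cited here.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked id lists: score(id) = sum over lists of 1 / (k + rank), rank from 1."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical ranked outputs of a lexical (BM25) retriever and a vector retriever.
bm25_ranking = ["d1", "d2", "d3"]
vector_ranking = ["d2", "d4", "d1"]

# Hypothetical metadata consulted by the structured filter.
METADATA = {
    "d1": {"region": "EU"},
    "d2": {"region": "US"},
    "d3": {"region": "EU"},
    "d4": {"region": "EU"},
}

# Hybrid retrieval: fuse the two retrieval mechanisms into one ranking.
fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])

# Hybrid search in this page's sense: apply an attribute predicate on top.
filtered = [doc_id for doc_id in fused if METADATA[doc_id]["region"] == "EU"]
```

The fusion step changes how candidates are scored; the filter step only decides which candidates are allowed to remain.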

Two execution strategies: pre-filter vs post-filter

| | Pre-filter | Post-filter |
| --- | --- | --- |
| Order | Restrict the candidate set to rows matching the filter, then do NN within that set | Do ANN ignoring the filter, then drop non-matching hits |
| Recall at top-K | High: the NN search covers the whole filtered set | Can return fewer than K results if the filter rejects most of the unfiltered top-K |
| Latency when filter is very selective | Can be bad for filtered index traversal when the matching subset is tiny relative to the ANN index's natural traversal | Fast: ANN ignores filter cost |
| Latency when filter is loose | Adds candidate-pruning cost | No extra cost |
| Implementation | Filtered-HNSW / filtered-IVF / scan-then-NN if filter is very selective | Expand top-K (e.g. 10×), filter, keep K |
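
The two orders can be sketched as follows. An exact scan stands in for the ANN index, and the function names, the `expansion` parameter, and the toy data are all assumptions chosen to show the under-return behaviour of post-filtering.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(items, query, k):
    """Stand-in for an ANN index: exact top-k by squared Euclidean distance."""
    return sorted(items, key=lambda item: squared_distance(item["vector"], query))[:k]

def pre_filter_search(items, query, predicate, k):
    """Restrict to matching rows first, then search within that subset."""
    candidates = [item for item in items if predicate(item["metadata"])]
    return nearest(candidates, query, k)

def post_filter_search(items, query, predicate, k, expansion=1):
    """Search k * expansion hits ignoring the filter, then drop non-matching hits."""
    hits = nearest(items, query, k * expansion)
    return [item for item in hits if predicate(item["metadata"])][:k]

# Hypothetical data: 1-D vectors 0..9; only the three farthest items are in stock.
ITEMS = [{"id": i, "vector": [float(i)], "metadata": {"in_stock": i >= 7}} for i in range(10)]

def in_stock(meta):
    return meta["in_stock"]

query = [0.0]
pre = [item["id"] for item in pre_filter_search(ITEMS, query, in_stock, k=3)]
post_plain = [item["id"] for item in post_filter_search(ITEMS, query, in_stock, k=3)]
post_expanded = [
    item["id"] for item in post_filter_search(ITEMS, query, in_stock, k=3, expansion=3)
]
```

With this data the pre-filter returns three in-stock items, the plain post-filter returns none because the unfiltered top 3 are all out of stock, and a 3× expansion still recovers only two of the three.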

Production systems often select a strategy adaptively based on estimated filter selectivity, or mix the two: pre-filter on high-selectivity predicates and post-filter on low-selectivity ones. The Expedia post does not disclose which strategy its Embedding Store uses.
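
One way such adaptive selection might look is sketched below. It estimates the fraction of rows a predicate matches from a sample and picks pre-filtering when that fraction is small. The sampling approach, the `0.05` threshold, and all names are assumptions; a real engine would more likely consult attribute-index statistics, and nothing here describes Expedia's implementation.

```python
import random

def estimate_match_fraction(items, predicate, sample_size=200, seed=0):
    """Estimate the fraction of rows that pass the filter from a random sample."""
    if not items:
        return 0.0
    rng = random.Random(seed)
    # Small collections are scanned in full; larger ones are sampled.
    sample = items if len(items) <= sample_size else rng.sample(items, sample_size)
    return sum(1 for item in sample if predicate(item["metadata"])) / len(sample)

def choose_strategy(match_fraction, threshold=0.05):
    """Few matching rows (a highly selective filter) favours pre-filtering."""
    return "pre_filter" if match_fraction < threshold else "post_filter"

# Hypothetical rows: 50 tenants, and 90% of rows in stock.
ITEMS = [
    {"id": i, "metadata": {"tenant_id": i % 50, "in_stock": i % 10 != 0}}
    for i in range(100)
]

def one_tenant(meta):
    return meta["tenant_id"] == 7

def in_stock(meta):
    return meta["in_stock"]

tenant_fraction = estimate_match_fraction(ITEMS, one_tenant)
stock_fraction = estimate_match_fraction(ITEMS, in_stock)
tenant_strategy = choose_strategy(tenant_fraction)
stock_strategy = choose_strategy(stock_fraction)
```

The code works with the matching fraction rather than the word "selectivity" because a highly selective predicate is one that matches a small fraction of rows, which is easy to invert by accident.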

What the filter can express

Depends on the vector DB's schema support; typical primitives:

  • Equality / inequality on categorical / tag fields.
  • Range predicates on numeric / date fields (price < 100, created_at > T).
  • Containment for array / set fields (genres CONTAINS "scifi").
  • Boolean compositions (AND / OR / NOT).
  • Sometimes geospatial predicates (distance-from-point, bounding box).

The collection schema pins which fields are filterable — and often which are indexed for filter (attribute indexes distinct from the vector index).
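
The primitives listed above can be sketched as a tiny evaluator over nested tuples. The expression format, the operator names, and the `FILTERABLE_FIELDS` schema are invented for this example and do not correspond to any particular vector DB's filter syntax.

```python
# Hypothetical schema: only these fields may appear in a filter.
FILTERABLE_FIELDS = {"price", "category", "genres", "in_stock"}

def evaluate(expr, meta):
    """Evaluate a nested-tuple filter expression against one metadata dict."""
    op = expr[0]
    # Boolean compositions recurse into their sub-expressions.
    if op == "and":
        return all(evaluate(sub, meta) for sub in expr[1:])
    if op == "or":
        return any(evaluate(sub, meta) for sub in expr[1:])
    if op == "not":
        return not evaluate(expr[1], meta)
    # Leaf predicates have the shape (operator, field, value).
    _, field, value = expr
    if field not in FILTERABLE_FIELDS:
        raise ValueError(f"field {field!r} is not filterable in this schema")
    actual = meta.get(field)
    if op == "eq":
        return actual == value
    if op == "ne":
        return actual != value
    if op == "lt":
        return actual is not None and actual < value
    if op == "gt":
        return actual is not None and actual > value
    if op == "contains":
        return actual is not None and value in actual
    raise ValueError(f"unknown operator {op!r}")

# price < 100 AND (category = electronics OR genres CONTAINS "scifi") AND NOT in_stock = False
expr = (
    "and",
    ("lt", "price", 100),
    ("or", ("eq", "category", "electronics"), ("contains", "genres", "scifi")),
    ("not", ("eq", "in_stock", False)),
)

item_a = {"price": 40, "category": "books", "genres": ["scifi"], "in_stock": True}
item_b = {"price": 40, "category": "books", "genres": ["romance"], "in_stock": True}

a_matches = evaluate(expr, item_a)
b_matches = evaluate(expr, item_b)
```

Rejecting fields outside the schema at evaluation time mirrors the point that the collection schema, not the query, decides what is filterable.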

Why hybrid search is a product-critical primitive

Most real recommendation / search workloads cannot return "semantically similar, but not for sale in your country / above your price ceiling / out of stock / past its expiry date". The filter is correctness-critical, not nice-to-have. Pure similarity search without metadata-aware filtering is rarely the user-facing primitive — it's one half of the user-facing primitive.

Named examples from ingested sources:

  • Expedia Embedding Store: price < 100, category = electronics (this post; travel / e-commerce shape implied).
  • Amazon S3 Vectors: the example query shipped with the preview launch is s3vectors.query_vectors(..., filter={"genre": "scifi"}, topK=3).

Design consequences for the embedding store

  • The collection schema must be designed knowing which filter attributes are load-bearing — those become indexed columns, not just stored properties.
  • Attribute freshness can matter more than vector freshness: a change to in_stock has to be reflected in the filter immediately, whereas the embedding can tolerate some staleness.
  • Cardinality planning — a tenant_id filter against millions of tenants is a very different shape from a genre filter over a handful of values; index structure should match.
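
A minimal sketch of the first two consequences, assuming a simple in-memory inverted index: filterable attributes live in their own structure, so an attribute update is visible to the next lookup without touching any embedding. The `AttributeIndex` class and its methods are hypothetical. An equality index like this suits a low-cardinality field such as genre; a field like tenant_id over millions of tenants may call for a different structure, such as partitioning.

```python
from collections import defaultdict

class AttributeIndex:
    """Inverted index from (field, value) to item ids, kept apart from the vectors."""

    def __init__(self):
        self._ids = defaultdict(set)   # (field, value) -> set of item ids
        self._current = {}             # (item_id, field) -> current value

    def set_value(self, item_id, field, value):
        """Insert or update one attribute; values must be hashable."""
        key = (item_id, field)
        if key in self._current:
            # Remove the item from the posting set of its previous value.
            self._ids[(field, self._current[key])].discard(item_id)
        self._current[key] = value
        self._ids[(field, value)].add(item_id)

    def lookup(self, field, value):
        """Return a copy of the ids whose field currently equals value."""
        return set(self._ids.get((field, value), set()))

# Embeddings are stored separately and are not rewritten by attribute updates.
VECTORS = {"a": [0.1, 0.9], "b": [0.2, 0.8], "c": [0.3, 0.7]}

index = AttributeIndex()
for item_id in VECTORS:
    index.set_value(item_id, "in_stock", True)

# Item "b" sells out: only the attribute index changes.
index.set_value("b", "in_stock", False)

available = index.lookup("in_stock", True)
sold_out = index.lookup("in_stock", False)
```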
