CONCEPT Cited by 4 sources

Hybrid retrieval (BM25 + dense vectors)

Hybrid retrieval is the pattern of combining a lexical index (BM25 / keyword) with a dense vector index (semantic embeddings) in the same retrieval pipeline, so the ranker can exploit exact-term matching and paraphrase/synonym matching at the same time.

Why "hybrid" and not one or the other

  • BM25 alone is superb at exact-term matching, acronyms, proper nouns, and queries where the user already knows the right word. Weakness: paraphrase / synonyms / cross-domain term transfer.
  • Dense vectors alone are strong at semantic matching and paraphrase robustness. Weakness: can miss exact-term matches that lexical would nail; embedding drift / out-of-domain failure.
  • Hybrid lets each cover the other's weaknesses. Ranking typically fuses lexical + semantic scores (weighted sum, reciprocal rank fusion, or a learned ranker on top).

BM25 is a workhorse, not a fallback

Dash's own framing from sources/2026-01-28-dropbox-knowledge-graphs-mcp-dspy-dash:

"Today we use both a lexical index—using BM25—and then store everything as dense vectors in a vector store. While this allows us to do hybrid retrieval, we found BM25 was very effective on its own with some relevant signals. It's an amazing workhorse for building out an index."

Two production signals in that quote:

  1. BM25 is the primary retrieval surface, not a fallback or a pre-filter. "Very effective on its own with some relevant signals" is a strong claim from a team doing modern agentic RAG.
  2. Dense vectors are additive, not a replacement. Hybrid enables better recall on paraphrase queries without displacing BM25 as the lexical anchor.

This contradicts a common vendor-pitched trajectory where pure vector retrieval is positioned as the successor to BM25. At Dash's scale and domain, BM25 kept its seat.
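The scoring function behind that workhorse is compact. A minimal pure-Python sketch of Okapi BM25 — toy whitespace tokenization, no inverted-index structures, and the textbook `k1`/`b` defaults rather than anything Dash has published:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against query_terms with Okapi BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()                      # document frequency per term
    for d in docs:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs:
        tf, dl, s = Counter(d), len(d), 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * dl / avgdl))
        scores.append(s)
    return scores

docs = [
    "dropbox dash hybrid retrieval".split(),
    "bm25 lexical index workhorse".split(),
    "dense vector embeddings semantic".split(),
]
print(bm25_scores("bm25 index".split(), docs))  # only the middle doc matches
```

Exact-term behaviour is visible immediately: documents sharing no query tokens score exactly zero, which is precisely the paraphrase gap the dense side exists to cover.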

Typical hybrid pipeline shape

  1. Ingest → normalize to text (markdown normalization, text extraction from docs / images / PDFs / audio / video).
  2. Parallel: BM25 indexing + embedding generation.
  3. At query time: parallel retrieval from both indexes, top-K each.
  4. Fusion / re-ranking: scores combined (RRF, weighted sum, or a learned ranker).
  5. Multiple ranking passes applied on top (personalization, ACL filters, knowledge-graph edges — Dash's framing).
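Steps 3–4 of that shape can be sketched as follows. `bm25_search` and `vector_search` are hypothetical stubs standing in for real index clients; the fusion pass uses RRF with the conventional k=60 smoothing constant:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical index stubs; a real system would call a BM25 engine and a
# vector store here. Each returns [(doc_id, score), ...] best-first.
def bm25_search(query, k):
    return [("doc-a", 12.3), ("doc-b", 9.1)][:k]

def vector_search(query, k):
    return [("doc-b", 0.87), ("doc-c", 0.71)][:k]

def hybrid_retrieve(query, k=50):
    # Step 3: query both indexes in parallel, top-K each.
    with ThreadPoolExecutor(max_workers=2) as pool:
        lex_f = pool.submit(bm25_search, query, k)
        sem_f = pool.submit(vector_search, query, k)
        lex, sem = lex_f.result(), sem_f.result()
    # Step 4: fuse by reciprocal rank; raw scores are ignored, only
    # rank positions matter, so no cross-index calibration is needed.
    fused = {}
    for results in (lex, sem):
        for rank, (doc_id, _score) in enumerate(results, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (60 + rank)
    return sorted(fused.items(), key=lambda kv: -kv[1])

print(hybrid_retrieve("hybrid retrieval"))  # doc-b wins: both retrievers agree
```

The cross-retriever-consensus effect shows up directly: `doc-b` appears in both lists and outranks either list's own top hit.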

Fusion techniques: RRF vs RSF (the two industry standards)

MongoDB's 2025-09-30 post (sources/2025-09-30-mongodb-top-considerations-when-choosing-a-hybrid-search-solution) names the two fusion algorithms that "have become standard techniques in the market":

  • Reciprocal Rank Fusion (RRF) — rank-position-based fusion. No score normalization needed; rewards cross-retriever consensus; standard form ∑ 1/(k + rank_r(d)) with k typically 60.
  • Relative Score Fusion (RSF) — score-value-based fusion with per-retriever normalization; preserves score-magnitude information; more granular than rank alone; requires normalization-method + weight tuning.

RRF is the default starting point (no calibration, universal defaults work); RSF is adopted when the rank-based information loss costs measurable quality. Other fusion shapes exist in the design space — Figma's min-max-normalized + exact-match-boost + interleave is a specific RSF-family realization.
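Both fusion shapes can be sketched in a few lines each; the function names and toy rankings below are illustrative, not drawn from any of the cited sources:

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: sum 1/(k + rank) across retrievers."""
    fused = {}
    for ranking in rankings:              # each ranking: doc ids, best first
        for rank, doc in enumerate(ranking, start=1):
            fused[doc] = fused.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)

def rsf(score_maps, weights):
    """Relative Score Fusion: min-max normalize each retriever's raw
    scores, then take a weighted sum — keeps score-magnitude information
    that RRF throws away, at the cost of normalization + weight tuning."""
    fused = {}
    for scores, w in zip(score_maps, weights):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        for doc, s in scores.items():
            fused[doc] = fused.get(doc, 0.0) + w * (s - lo) / span
    return sorted(fused, key=fused.get, reverse=True)

# "doc2" is mid-ranked by both retrievers; RRF rewards that consensus.
lexical = ["doc1", "doc2", "doc3"]
semantic = ["doc4", "doc2", "doc5"]
print(rrf([lexical, semantic]))  # doc2 first
```

Note what `rrf` never looks at: the raw scores. That is both its robustness (no calibration) and the information loss that motivates switching to RSF.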

Industry evolution: vendor-origin architecture bias

From MongoDB's 2025-09-30 post, vendor architectural origins predict hybrid-search shape:

  • Lexical-first platforms (MongoDB, Elasticsearch, OpenSearch, Solr) built around BM25 on inverted indexes; added vector search as a second index type; typically use separate indexes fused at query time. "The main challenge was to add vector search features and implement the bridging logic with their existing keyword search infrastructure."
  • Vector-first platforms (Pinecone, Weaviate, Milvus, Qdrant) built around dense-vector ANN; added lexical via sparse vectors rather than inverted indexes — "Implementing lexical search through traditional inverted indexes was often too costly due to storage differences, increased query complexity, and architectural overhead. Many adopted sparse vectors, which represent keyword importance in a way similar to traditional term-frequency methods used in lexical search."

MongoDB's architectural framing: "lexical-first systems tend to offer stronger keyword capabilities and more flexibility in tuning each search type, while vector-first systems provide a simpler, more unified hybrid experience." Wiki treats this as one-vendor positioning — the more neutral reading is that the boundary is blurring (Elasticsearch's ELSER emits learned sparse vectors; Pinecone supports hybrid natively).

Native hybrid search functions (the 2025 productization trend)

The 2025-09-30 MongoDB post names the industry-level convergence toward native hybrid-search primitives — database / search engine APIs that handle fusion internally rather than leaving score combination to application code. Examples: MongoDB Atlas Hybrid Search (2025), Elasticsearch rrf retriever (8.8+), OpenSearch hybrid query, Weaviate hybrid operator, Qdrant hybrid queries, Pinecone sparse-dense. MongoDB's position: "Solutions with hybrid search functions handle the combination of lexical and vector search natively, removing the need for developers to manually implement it. This reduces development complexity, minimizes potential errors, and ensures that result merging and ranking are optimized by default."

Re-ranking (the layer above hybrid)

Hybrid retrieval returns a candidate set; re-ranking refines ordering on top. MongoDB names "cross-encoders, learning-to-rank models, and dynamic scoring profiles" as the emerging techniques. Re-ranking is not a replacement for hybrid retrieval — it sits on top, re-scoring the top-K candidates with more expensive but higher-quality models. Typical pipeline: hybrid retrieval → cross-encoder rerank → top-N to consumer.
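The hybrid-then-rerank layering can be sketched as follows; `cross_encoder_score` is a deliberately trivial stand-in (token overlap) for a real cross-encoder model, which would jointly encode each (query, document) pair:

```python
def cross_encoder_score(query, doc_text):
    # Stand-in for a real cross-encoder: jointly scores (query, doc).
    # Here: crude token overlap, purely to make the pipeline runnable.
    q, d = set(query.lower().split()), set(doc_text.lower().split())
    return len(q & d) / max(len(q), 1)

def rerank(query, candidates, top_n=3):
    """Re-score the hybrid top-K with the expensive model; keep top-N."""
    scored = [(cross_encoder_score(query, text), doc_id)
              for doc_id, text in candidates]
    scored.sort(reverse=True)
    return [doc_id for _score, doc_id in scored[:top_n]]

# Candidates as returned by the cheaper hybrid stage: (doc_id, text).
candidates = [("d1", "bm25 lexical index"),
              ("d2", "hybrid retrieval with bm25 and vectors"),
              ("d3", "cooking recipes")]
print(rerank("hybrid bm25 retrieval", candidates))
```

The shape, not the scorer, is the point: the expensive model only ever sees the candidate set, never the corpus.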

Composition with knowledge graphs

Dash layers a concepts/knowledge-graph on top of the hybrid index rather than replacing either lexical or vector component. The graph's "knowledge bundle" summaries are themselves re-ingested through the hybrid index pipeline (both BM25 and vector), so graph signals ride on the same retrieval surface rather than becoming a separate third query path. This keeps runtime retrieval a single fused lookup instead of three independent ones.
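The single-ingest-path idea can be sketched as below; `ingest`, `embed`, and the bundle IDs are hypothetical illustrations, not Dash's API:

```python
def ingest(doc_id, text, bm25_index, vector_index, embed):
    """One ingest path: every document hits both indexes."""
    bm25_index[doc_id] = text.split()    # stand-in for real BM25 indexing
    vector_index[doc_id] = embed(text)   # stand-in for real embedding call

bm25_index, vector_index = {}, {}
embed = lambda text: [float(len(text))]  # placeholder embedding function

# Raw documents and knowledge-bundle summaries share the same pipeline,
# so graph signals surface through the same fused lookup at query time
# instead of becoming a third independent query path.
ingest("doc-1", "quarterly planning notes", bm25_index, vector_index, embed)
ingest("bundle-planning", "summary: planning docs cluster around Q3 goals",
       bm25_index, vector_index, embed)
```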

Tradeoffs

  • Two indexes to maintain — double the ingestion + freshness plumbing.
  • Two indexes to size — vector stores scale differently from BM25 (memory-bound, dimensional), so capacity planning is separate.
  • Ranking complexity. Fusion weights + re-ranking become hyperparameters; offline eval against NDCG-style metrics becomes a first-class concern.
  • Embedding-drift blast radius. Changing embedding models requires re-indexing the entire corpus; BM25 does not have this problem. Separate versioning.
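The separate-versioning point can be sketched as below (all names are illustrative): the two index types carry independent version tags, and only an embedding-model change forces a corpus-wide re-embed.

```python
from dataclasses import dataclass

@dataclass
class IndexVersion:
    # BM25 and vector indexes version independently: swapping the
    # embedding model invalidates every stored vector, but leaves the
    # BM25 side untouched.
    bm25_analyzer: str
    embedding_model: str

def needs_full_reembed(current: IndexVersion, new: IndexVersion) -> bool:
    return current.embedding_model != new.embedding_model

current = IndexVersion(bm25_analyzer="porter-v1", embedding_model="emb-v1")
proposed = IndexVersion(bm25_analyzer="porter-v1", embedding_model="emb-v2")
print(needs_full_reembed(current, proposed))  # embedding change → re-index
```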

Seen in

  • sources/2026-01-28-dropbox-knowledge-graphs-mcp-dspy-dash — Dash explicitly running BM25 + dense vectors as a hybrid index; knowledge-graph-derived "bundles" flowing through the same hybrid pipeline; BM25 as "amazing workhorse".
  • sources/2026-04-21-figma-the-infrastructure-behind-ai-search-in-figma — Figma AI Search runs two independent OpenSearch indexes (one lexical / fuzzy-match over component names and descriptions, one k-NN over CLIP embeddings) queried simultaneously; scores combined via min-max normalization per index + exact-lexical-match boost + interleave (patterns/hybrid-lexical-vector-interleaving). Worked example: "mouse" returns the icon titled "Mouse" and cursor-adjacent icons. Preserves existing lexical behavior while adding semantic recall — migration-safe hybrid rollout shape.
  • sources/2025-09-30-mongodb-top-considerations-when-choosing-a-hybrid-search-solution — MongoDB's 2025-09-30 industry-evolution survey and buyer's guide. Names the 2022–2023 inflection when pure-vector retrieval proved insufficient; identifies RRF and RSF as the two standard fusion techniques; taxonomizes vendors as lexical-first vs vector-first; positions sparse vectors as vector-first platforms' bridging primitive to lexical; identifies the industry-level 2025 convergence on native hybrid-search functions (MongoDB Atlas's own release is one named instance, realized as systems/atlas-hybrid-search); names cross-encoders, learning-to-rank, and dynamic scoring profiles as the emerging re-ranking layer above hybrid retrieval.
  • sources/2026-04-16-cloudflare-ai-search-the-search-primitive-for-your-agents — Cloudflare AI Search promotes hybrid retrieval to a managed, runtime-provisioned primitive: vector + BM25 in parallel with fusion as an instance-level config (index_method, fusion_method: "rrf" | "max", reranking: true with @cf/baai/bge-reranker-base), plus per-document metadata boost at query time (concepts/metadata-boost) and cross-instance fan-out as composable layers on top. The 2026-04-16 worked example — "ERR_CONNECTION_REFUSED timeout" — is the canonical 2026 illustration of why both engines are needed (vector for paraphrase, BM25 for exact tokens), and the two-tokenizer config (porter for natural language, trigram for code) is the first-class productization of the BM25 content-type-awareness knob.