PATTERN
Hybrid lexical+vector interleaving (min-max normalized, exact-match boosted)
Definition¶
Hybrid lexical+vector interleaving is a specific score-fusion tactic for combining results from two independent search indexes (one lexical / BM25-or-fuzzy-match, one dense-vector / semantic) into a single ranked result list:
- Query both indexes in parallel.
- Collect top-K results from each, with their native scores (which are not directly comparable across the two indexes).
- Apply min-max normalization per index, remapping each score list to [0, 1].
- Boost exact lexical matches — give them a score bonus so they rank above semantic neighbours of equal raw relevance.
- Interleave results by updated score, de-duplicating on canonical item ID.
The defining feature: two independently-maintained indexes, per-index re-scoring, lexical-exact-match as a hard signal. This is distinct from the general hybrid retrieval idea (which just says "combine both signals"); this pattern pins down how the scores get combined.
(Source: sources/2026-04-21-figma-the-infrastructure-behind-ai-search-in-figma)
Intent¶
Three forces that push on hybrid search:
- Exact matches must win. A user searching "Mouse" who has a component literally named "Mouse" expects it as the top result. A pure-vector ranker can demote it below thematically similar cursor icons if their embeddings happen to cluster tighter. The pattern encodes "exact lexical match trumps semantic similarity of the same magnitude" as a direct product rule, not an emergent ranker property.
- Semantic neighbours still matter. The same user wants the cursor-adjacent icons too — just below "Mouse". So the pattern can't be "lexical first, vector fallback"; it has to combine, with lexical bonused.
- Cross-index scores aren't comparable. BM25 scores and vector cosine distances live on entirely different scales. You can't just sum or weighted-average the raw numbers and expect sensible results. Normalization is the pre-condition.
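A toy numeric sketch of the third force (all scores here are made up for illustration): raw BM25 scores are unbounded while cosine similarities sit in [-1, 1], so summing them directly is meaningless; min-max remaps each list to [0, 1] first.

```python
def minmax(scores, eps=1e-9):
    """Remap a score list to [0, 1]; eps guards the all-equal-scores case."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / max(hi - lo, eps) for s in scores]

bm25_scores = [14.2, 9.7, 3.1]      # hypothetical lexical-index scores (unbounded)
cosine_scores = [0.83, 0.79, 0.41]  # hypothetical vector-index similarities

print(minmax(bm25_scores))    # best hit -> 1.0, worst -> 0.0
print(minmax(cosine_scores))  # now on the same [0, 1] scale as the lexical list
```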
Mechanism¶
```python
EPS = 1e-9                # guards the degenerate case where all scores are equal
EXACT_MATCH_BOOST = 0.3   # small positive constant; tune against the eval set

def is_exact_match(title, query):
    # Implementation-specific; here, case-insensitive full-string equality.
    return title.strip().casefold() == query.strip().casefold()

def minmax(hits):
    scores = [h.score for h in hits]
    lo, hi = min(scores), max(scores)
    for h in hits:
        h.norm_score = (h.score - lo) / max(hi - lo, EPS)

# Query both indexes in parallel with the same user query.
lexical_hits = lexical_index.search(query, topK=K)
vector_hits = vector_index.search(embed(query), topK=K)

# Per-index re-scoring: remap each list's native scores to [0, 1].
minmax(lexical_hits)
minmax(vector_hits)

# Exact-match boost applied to the lexical-side results only.
for h in lexical_hits:
    if is_exact_match(h.title, query):
        h.norm_score += EXACT_MATCH_BOOST

# Fusion: sort the union by normalized score, de-duplicating on canonical ID.
combined, seen = [], set()
for h in sorted(lexical_hits + vector_hits, key=lambda h: -h.norm_score):
    if h.item_id in seen:
        continue
    seen.add(h.item_id)
    combined.append(h)
```
The actual interleaving/fusion step is the sort-by-norm-score over the union of the two lists, de-duplicating on item ID. Different implementations will diverge on:
- What counts as "exact match" (full-string equality? case-insensitive? tokenized substring?).
- Boost size (small positive constant? multiplier?).
- Top-K on each side (same? weighted by expected precision?).
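The first of those divergence points can be made concrete. These predicate variants are hypothetical — the source does not say which definition Figma uses:

```python
def exact_full_string(title, query):
    # Strictest: byte-for-byte equality.
    return title == query

def exact_case_insensitive(title, query):
    # Common middle ground: casefold both sides first.
    return title.casefold() == query.casefold()

def exact_token_subset(title, query):
    # Loosest: every query token appears verbatim among the title's tokens.
    return set(query.lower().split()) <= set(title.lower().split())

print(exact_case_insensitive("Mouse", "mouse"))          # True
print(exact_token_subset("Mouse Cursor Icon", "mouse"))  # True
print(exact_full_string("Mouse", "mouse"))               # False
```

Each step looser admits more boosted items, so the choice interacts with the boost constant.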
Figma AI Search instance¶
"Searches are performed simultaneously against both the lexical index and the new embeddings index. Since raw scores from the independent OpenSearch indexes are not directly comparable, the result sets are assigned new scores based on a min-max normalization, with exact lexical matches receiving a boost. Results are then interleaved according to their updated scores.
Now a query returns not just the lexical results, but also appropriate results based on a semantic understanding as well. For example, "mouse" returns not just an icon that is specifically titled "Mouse" but also cursor-adjacent icons."
Context: lexical search (fuzzy string matching over component names and descriptions) predated AI-powered search in Figma Assets search. The hybrid ranker let the team safely roll out the new semantic index without regressing the existing behaviour — the pattern is a migration-compatible retrieval shape, not just a ranking policy.
Why min-max over reciprocal rank fusion (RRF)?¶
RRF — score each item as sum(1 / (k + rank_in_list)) — is a widely cited alternative. Trade-offs:
| | Min-max normalization | RRF |
|---|---|---|
| What's preserved | The relative magnitudes within a list | Only the ranks, not the scores |
| Exact-match boost | Easy — add to norm score | Awkward — ranks don't carry a magnitude to boost |
| Sensitivity to tail | Sensitive (outlier tops affect normalization) | Robust |
| Simplicity | Very simple | Very simple |
| Tunable | One boost constant | One k parameter |
Figma's choice of min-max + boost is explicit in the post. RRF would be a reasonable alternative for a product without the "exact lexical match must win" rule.
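For reference, RRF itself is only a few lines. This sketch (the k value and the example lists are illustrative, not from the source) also shows why the boost is awkward under RRF: items carry only rank-derived scores, with no native magnitude to add a bonus to.

```python
from collections import defaultdict

def rrf(ranked_lists, k=60):
    """Reciprocal rank fusion: score(item) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for hits in ranked_lists:
        for rank, item_id in enumerate(hits, start=1):
            scores[item_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

lexical = ["mouse", "mouse-pad", "cursor"]   # hypothetical lexical top-K
vector = ["cursor", "pointer", "mouse"]      # hypothetical vector top-K
print(rrf([lexical, vector]))  # items in both lists rise to the top
```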
Compared to pure-vector retrieval with lexical fallback¶
| | Hybrid interleave (this pattern) | Pure vector + lexical fallback |
|---|---|---|
| Exact match | Always surfaces in top | Only surfaces if vectors fail |
| Implementation | Two indexes, fusion step | One index, conditional re-query |
| Latency | Two parallel queries + merge | One query + possibly second |
| Migration safety | Preserves lexical behaviour | Degrades lexical behaviour |
For a product rolling out AI search on top of an existing lexical search users already rely on, the hybrid pattern is the safer migration.
Comparison to Dropbox Dash¶
Dash also runs hybrid BM25 + vector retrieval (sources/2026-01-28-dropbox-knowledge-graphs-mcp-dspy-dash), with its own fusion / reranker layer on top (Dash uses a learned ranker over multiple ranking passes including BM25, vector, knowledge-graph edges, ACL, and personalisation). The retrieval-pattern shapes match, but Dash's post is qualitative on fusion ("we found BM25 was very effective on its own with some relevant signals … hybrid retrieval"), while Figma's post gives the specific min-max-normalization + exact-match-boost mechanism. Both are canonical instances from different angles.
When to apply¶
- Existing production lexical search + new vector search being rolled out; preserving lexical behaviour is required.
- Product where "exact match must win" is a hard rule (most naming / title / component-library retrieval).
- Small-to-moderate corpus where maintaining two indexes is acceptable.
- OpenSearch-shaped stack where both indexes live in the same cluster / service and parallel querying is cheap.
Don't apply when:
- You have only one signal (purely semantic corpus, or purely lexical).
- A learned fusion model (LambdaRank, learned-to-rank, deep ranker) is available and benched better — then the learned ranker replaces the min-max + boost fusion.
- Latency budget can't absorb two parallel queries + merge.
Caveats¶
- "Exact match" ambiguity. Fuzzy-lexical-index scores themselves are approximate; deciding which result is an "exact" match needs a separate check (title equality, ID equality) that can't just use the lexical score threshold.
- De-duplication is load-bearing. The same item can surface from both indexes — the union must dedupe on canonical ID or the top-K gets crowded with twins.
- Min-max is sensitive to outliers. A single high-score outlier compresses the rest of the normalised scores. Clipping or log-normalization are common defences.
- Boost constant is a hyperparameter. Needs tuning against the eval set; too small and exact matches lose; too large and semantic neighbours get buried.
- Queries that only one index serves well. A pure typo query (lexical strong, semantic weak) or a pure paraphrase query (semantic strong, lexical weak) can still be handled, because both lists contribute; but the normalization window per list means the weak list's "best" still scores well on its own list. Usually acceptable; occasionally surprising.
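The outlier caveat above is easy to demonstrate. The log1p pre-transform shown here is one of the common defences mentioned; the numbers are illustrative:

```python
import math

def minmax(scores, eps=1e-9):
    """Remap a score list to [0, 1]; eps guards the all-equal-scores case."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / max(hi - lo, eps) for s in scores]

# One high-score outlier compresses every other normalized score toward zero:
raw = [100.0, 5.0, 4.0, 3.0]
print(minmax(raw))  # second-best lands near 0.02 despite being a strong hit

# Defence: log-scale before min-max, so the outlier's dominance shrinks.
print(minmax([math.log1p(s) for s in raw]))  # second-best recovers to ~0.13
```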
Related¶
- concepts/hybrid-retrieval-bm25-vectors — the general concept; this pattern is a specific realisation.
- systems/figma-ai-search — canonical instance.
- systems/amazon-opensearch-service — the two-indexes-in-one-cluster substrate Figma uses.
- systems/dash-search-index — parallel hybrid instance at Dropbox.
Seen in¶
- sources/2026-04-21-figma-the-infrastructure-behind-ai-search-in-figma — min-max normalization per index + exact-lexical-match boost + interleave; preserves existing lexical behaviour for safe rollout of the new vector index.