CONCEPT Cited by 1 source

Similarity-tier retrieval

Similarity-tier retrieval names the product and evaluation constraint that a search feature must return high-quality results at several similarity levels at once (exact matches, close variants, and broad or diverse results), because users start from something close and expand outward when they want to explore, not the other way round.

The load-bearing implication concerns the evaluation bar: showing "good diverse results" is not enough if the "find the exact thing" case fails, because users then stop trusting the feature and do not use it for exploration either.

Core framing

From Figma AI Search (Source: sources/2026-04-21-figma-how-we-built-ai-powered-search-in-figma):

"To address the full spectrum of designer needs, our solution had to deliver relevant results across the board—from exact matches to highly similar results to somewhat different options. The feature wouldn't cut it unless we could serve up high-quality results at different similarity levels."

"Our research showed that users prefer to start with something closer or more similar, even when ultimately seeking diverse results. In other words, if we couldn't prove we could find the needle in the haystack, designers wouldn't trust the feature for broader exploration."

The second quote is the critical one: close-match quality is the pre-condition for diverse-match trust, not an independent dimension.

The similarity tiers

Three named tiers from Figma's framing:

Tier	Description	Example query
Exact	The specific item the user knows exists	`"project [codename] theme picker"`
Near-similar	Variants of a known concept	`"checkout screen"` (find any existing checkout screen)
Broad / diverse	Inspiration across a theme	`"red website with green squiggly lines"` (find stylistically-related work)

A feature that aces tier 3 but flunks tier 1 fails the user's trust threshold. A feature that aces tier 1 but flunks tier 3 is still useful for creation — and users will then risk tier 3 because they trust the system.

Implications for evaluation

An eval suite must cover all tiers with distinct query distributions. Shipping a system that hits 85% NDCG on representative queries is not the same as hitting 85% on each tier separately. In particular:

Exact-match queries (name / codename / acronym) usually have a single correct answer; NDCG collapses to "is the right item at position 1."
Near-similar queries expect a short ranked list of equally valid candidates.
Broad / diverse queries expect coverage and variety, so diversity-aware metrics such as α-NDCG apply (MMR, by contrast, is a re-ranking method rather than a metric); top-1 quality matters less.

Reporting an aggregate metric without the tier breakdown hides trust-critical regressions. Figma's eval tool (patterns/visual-eval-grading-canvas) is designed to work across all three tiers — the labelers see results across the spectrum and mark correct/incorrect on each.
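A minimal sketch of why the tier breakdown matters, assuming graded relevance labels are already available for each query's ranked results. The tier names, the `eval_rows` layout, and the sample gains are hypothetical and not taken from Figma's tooling. For brevity the ideal ordering is built from the retrieved items only, which overstates NDCG whenever a relevant item was never retrieved.

```python
import math
from collections import defaultdict


def dcg(gains):
    """Discounted cumulative gain for graded relevance values in rank order."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k(ranked_gains, k=10):
    """NDCG@k: DCG of the system ranking divided by DCG of the ideal ordering.

    Simplification: the ideal ordering is taken from the retrieved gains only.
    """
    ideal = dcg(sorted(ranked_gains, reverse=True)[:k])
    return dcg(ranked_gains[:k]) / ideal if ideal > 0 else 0.0


def ndcg_by_tier(eval_rows, k=10):
    """Mean NDCG@k per tier, plus the pooled aggregate over all queries.

    eval_rows: iterable of (tier_name, gains_in_rank_order) pairs (hypothetical layout).
    """
    per_tier = defaultdict(list)
    for tier, gains in eval_rows:
        per_tier[tier].append(ndcg_at_k(gains, k))
    report = {tier: sum(scores) / len(scores) for tier, scores in per_tier.items()}
    pooled = [s for scores in per_tier.values() for s in scores]
    report["aggregate"] = sum(pooled) / len(pooled)
    return report


# Hypothetical labels: one exact-match query whose single right answer landed
# at position 3, and four near/broad queries that are ranked ideally.
rows = [
    ("exact", [0, 0, 1]),
    ("near", [2, 2, 1]),
    ("near", [2, 1, 1]),
    ("broad", [1, 1, 1]),
    ("broad", [1, 1, 0]),
]
report = ndcg_by_tier(rows, k=3)
for name, score in report.items():
    print(f"{name}: {score:.2f}")
```

On this toy data the aggregate is 0.90 while the exact tier sits at 0.50, which is the kind of trust-critical gap a pooled number conceals.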

See concepts/relevance-labeling for the labeling axis that feeds these evals.

Implications for retrieval architecture

Similarity-tier retrieval shapes retrieval-stack design:

Multi-tier candidate generation. Hybrid retrieval (concepts/hybrid-retrieval-bm25-vectors) gives each tier its own candidate-generation path: BM25 / exact-string for tier 1, dense vectors for tier 2, and vector + MMR / clustering for tier 3. A single ranker then fuses all three.
Ranker loss function. If training labels are graded (concepts/relevance-labeling), a listwise loss (NDCG-based) that weights top-1 heavily protects tier-1 trust.
Query-intent detection vs tier-agnostic ranking. It is tempting to classify the query by tier and route it differently; Figma rejected this kind of mode detection ("designers oscillate between modes, so we decided Figma would offer a range of results and let users pick what most fits their needs"). The UI offers filters instead: the back-end retrieval is tier-agnostic and over-retrieves across tiers, and the user filters.
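One way the points above could be sketched: fuse a lexical ranking and a vector ranking without classifying the query, then optionally spread out the broad end of the list. Reciprocal rank fusion (RRF) is used here as a simple stand-in for the fusing ranker, and maximal marginal relevance (MMR) as the diversification step; the source does not say how Figma fuses or diversifies candidates. All ids, vectors, and parameter values are made up for illustration.

```python
import math


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked lists of doc ids; each list contributes 1 / (k + rank)."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr(query_vec, candidates, embeddings, top_n, lam=0.5):
    """Greedy MMR: prefer items relevant to the query but unlike those already picked."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < top_n:
        def score(doc_id):
            relevance = cosine(query_vec, embeddings[doc_id])
            redundancy = max(
                (cosine(embeddings[doc_id], embeddings[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical candidate lists from a lexical path and a vector path.
lexical_hits = ["a", "b", "c"]
vector_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([lexical_hits, vector_hits])
print(fused)

# Hypothetical 2-d embeddings: x1 and x2 are near-duplicates, y is different.
embeddings = {"x1": [1.0, 0.0], "x2": [0.99, 0.1], "y": [0.6, 0.8]}
diverse = mmr([1.0, 0.3], ["x1", "x2", "y"], embeddings, top_n=2, lam=0.5)
print(diverse)
```

With these toy inputs the fused order is b, a, d, c: items found by both paths rise to the top without any query classification. MMR picks x2 and then y, skipping the near-duplicate x1 that a pure relevance sort would have placed second.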

Implications for UX

Unified refinement interface. Rather than a tabbed tier-switcher ("Exact | Similar | Inspire"), Figma ships a single results view with metadata filters (created-by, file, recency). The tiers are present in the result set, not in the navigation.
Start narrow, allow expansion. Show the close matches first (this earns trust on the needle-in-haystack case), then let the user scroll or filter for broader results.

Relationship to other concepts

concepts/relevance-labeling — the labeling axis. Similarity-tier retrieval forces labels to cover tier distributions; otherwise the model overfits whichever tier the seed labels emphasise.
concepts/ndcg — the scoring metric. Report per-tier NDCG, not aggregate, for trust-critical use cases.
concepts/vector-similarity-search — the underlying mechanism. ANN's tier-by-distance output maps naturally to tiers 1→3, though calibration of "how far is tier-2 vs tier-3" varies by embedding model + corpus.
concepts/hybrid-retrieval-bm25-vectors — the mechanism that covers tier 1 (lexical) + tiers 2/3 (vector) in one retrieval stack.
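The calibration caveat on vector similarity search can be made concrete with a small sketch that buckets nearest-neighbour hits by cosine similarity. The cutoffs are purely hypothetical placeholders: as noted above, where one tier ends and the next begins depends on the embedding model and corpus, so in practice they would have to be tuned against labeled queries.

```python
def bucket_by_similarity(hits, close_cutoff=0.85, near_cutoff=0.60):
    """Group (doc_id, cosine_similarity) hits into rough tiers.

    The cutoff values are illustrative assumptions, not recommended defaults.
    """
    buckets = {"closest": [], "near": [], "broad": []}
    # Walk the hits from most to least similar so each bucket stays ordered.
    for doc_id, similarity in sorted(hits, key=lambda h: -h[1]):
        if similarity >= close_cutoff:
            buckets["closest"].append(doc_id)
        elif similarity >= near_cutoff:
            buckets["near"].append(doc_id)
        else:
            buckets["broad"].append(doc_id)
    return buckets


# Hypothetical ANN output for a single query.
hits = [("a", 0.97), ("b", 0.72), ("c", 0.41), ("d", 0.88)]
print(bucket_by_similarity(hits))
```

Keeping the cutoffs as parameters makes it easy to re-fit them when the embedding model or the corpus changes.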

Caveats

Tier labels are product-specific. Figma's three tiers ("frame lookup / frame variations / broad inspiration") are one taxonomy; other domains (code search, doc search, product search) will have different tier structures. The abstraction — deliver across tiers; close-match gates diverse-match trust — is the transferable part.
Trust threshold is not measurable in isolation. "Users wouldn't trust the feature" is a product / retention claim that shows up in adoption metrics, not in offline evals. Ship-gate on aggregate metrics understates the real threshold.
Over-indexing on tier-1 can starve tier-3. If every trade-off favours exact-match quality, diverse-exploration quality regresses even when tier-1 is fine. Watch all tiers as a dashboard, not a one-shot ship gate.
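A minimal sketch of the "watch all tiers" idea, assuming per-tier offline metrics (for example per-tier NDCG) are already computed for a baseline and a candidate system. The function name, the tolerance, and the scores are hypothetical. It only covers the offline side; the trust threshold itself still has to be read from adoption metrics.

```python
def tier_regressions(baseline, candidate, tolerance=0.01):
    """Return {tier: delta} for tiers whose metric dropped by more than tolerance."""
    return {
        tier: round(candidate[tier] - baseline[tier], 4)
        for tier in baseline
        if baseline[tier] - candidate[tier] > tolerance
    }


# Hypothetical per-tier scores: the candidate improves exact-match quality
# but gives up ground on broad / diverse queries.
baseline = {"exact": 0.92, "near": 0.80, "broad": 0.70}
candidate = {"exact": 0.95, "near": 0.80, "broad": 0.61}

regressed = tier_regressions(baseline, candidate)
print(regressed)
```

Here only the broad tier is flagged, even though the candidate looks better on the tier most tied to trust, which is the tier-3 starvation described above.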

Seen in

sources/2026-04-21-figma-how-we-built-ai-powered-search-in-figma — canonical framing: "if we couldn't prove we could find the needle in the haystack, designers wouldn't trust the feature for broader exploration." Three tiers (frame lookup / frame variations / broad inspiration); users start close, expand outward.

systems/figma-ai-search — canonical product instance.
concepts/relevance-labeling — labels must cover tiers.
concepts/vector-similarity-search — retrieval mechanism.
concepts/hybrid-retrieval-bm25-vectors — tier-spanning retrieval architecture.
concepts/ndcg — per-tier metric.