Relative Score Fusion (RSF)

Definition

Relative Score Fusion (RSF) is a score-fusion technique for combining result lists from multiple independent retrieval methods into a single ranked list. Unlike rank-based fusion, it uses each retriever's raw scores, applying per-retriever normalization to put the score scales on common ground. A typical form:

$$ \text{RSF}(d) = \sum_{r \in R} w_r \cdot \text{norm}_r(\text{score}_r(d)) $$

where $w_r$ is a per-retriever weight (tunable), and $\text{norm}_r$ is a per-retriever normalization (min-max, z-score, or sigmoid) that maps that retriever's scores to a shared target range (typically [0, 1]).
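The formula above can be sketched in a few lines. This is a minimal illustration, not any vendor's API: the function names, the choice of min-max as the normalizer, and the 0.5/0.5 weights are all assumptions for the example.

```python
def min_max_normalize(scores):
    """Map one retriever's scores onto [0, 1] (min-max normalization)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:  # degenerate case: all scores equal
        return {d: 1.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}

def rsf(result_lists, weights):
    """result_lists: {retriever_name: {doc_id: raw_score}}
       weights:      {retriever_name: w_r}
       Ranks docs by sum over retrievers of w_r * norm_r(score_r(d))."""
    fused = {}
    for name, scores in result_lists.items():
        normed = min_max_normalize(scores)
        for doc, s in normed.items():
            fused[doc] = fused.get(doc, 0.0) + weights[name] * s
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# BM25 scores live on a very different scale than cosine similarities;
# per-retriever normalization is what makes the sum meaningful.
bm25   = {"d1": 42.7, "d2": 17.3, "d3": 8.1}
vector = {"d2": 0.97, "d3": 0.62, "d4": 0.55}
ranked = rsf({"bm25": bm25, "vector": vector},
             weights={"bm25": 0.5, "vector": 0.5})
```

Note that d2, which both retrievers like, outranks d1, which only BM25 likes: the fused score reflects cross-retriever consensus, not just the single best raw score.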

MongoDB's 2025-09-30 framing:

"[RSF] works directly with raw scores from different sources of relevance, using normalization to minimize outliers and align modalities effectively at a more granular level than rank alone can provide."

— (MongoDB, 2025-09-30)

Why score-based instead of rank-based

Raw scores, when properly calibrated, encode confidence magnitude — information that pure rank-based methods like RRF discard:

  • A BM25 score of 42.7 carries more information than "rank #1" — it quantifies how strong the keyword match actually is.
  • A cosine-similarity score of 0.97 vs 0.62 at the same rank position tells you how confident the embedding model is.
  • A retriever returning 10 documents all with scores near 0.01 is signalling low relevance across the board — rank alone erases this.

RSF keeps that magnitude and lets the fusion score reflect both cross-retriever consensus and per-retriever confidence.

The normalization problem

The tradeoff vs RRF is that raw scores across retrievers are incomparable by default — fixing that is the engineering work:

| Normalization | What it does | Good at | Weakness |
|---|---|---|---|
| Min-max | (s - min) / (max - min) over the retriever's result set | Simple, bounded [0, 1] | Sensitive to outliers; a single extreme-high score compresses the rest |
| Z-score | (s - μ) / σ per retriever | Outlier-resistant | Output is unbounded and can be negative |
| Sigmoid / tanh | 1 / (1 + exp(-s)) with a per-retriever scale | Bounded and smooth | Requires per-retriever scale calibration |
| Quantile / percentile | Maps raw score to its rank percentile | Robust to any distribution | Loses absolute-magnitude info (approaches RRF) |

MongoDB's framing — "to minimize outliers and align modalities effectively at a more granular level than rank alone" — points at the middle-ground: normalizations that resist outliers (unlike naive min-max) but preserve magnitude information (unlike quantile/RRF).
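The four normalization families in the table can be sketched as plain functions. The example scores, and the sigmoid's `scale`/`center` parameters, are hypothetical; real systems would calibrate them per retriever.

```python
import math
import statistics

scores = [12.0, 9.5, 7.0, 4.0]  # hypothetical raw scores, one retriever

def min_max(xs):
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def z_score(xs):
    mu, sigma = statistics.mean(xs), statistics.pstdev(xs)
    return [(x - mu) / sigma for x in xs]

def sigmoid(xs, scale=1.0, center=0.0):
    # `scale` and `center` are the per-retriever calibration knobs
    # the table's "weakness" column refers to.
    return [1.0 / (1.0 + math.exp(-(x - center) / scale)) for x in xs]

def percentile(xs):
    # Rank-percentile (assumes distinct scores): this discards
    # magnitude and approaches rank-only fusion.
    order = sorted(xs)
    return [order.index(x) / (len(xs) - 1) for x in xs]
```

Running all four on the same list makes the tradeoffs concrete: min-max and percentile are bounded in [0, 1], z-score goes negative below the mean, and the sigmoid only spreads the values usefully once its center and scale roughly match the retriever's score range.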

Properties

  • Preserves score-magnitude information. Good for retrievers with calibrated scores (BM25 is calibrated-enough in practice; cosine similarity is bounded; vector dot products often are too).
  • Weighted — expressive. Per-retriever weights let you bias toward one modality (e.g. α = 0.7 toward vector search when the corpus is paraphrase-heavy).
  • Tunable per-workload. Offline eval with NDCG gives a principled way to tune weights and normalization choice.
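The last property, tuning weights offline against NDCG, can be sketched with a small grid search. Everything here is illustrative: the single blend parameter `alpha`, the scores (assumed already normalized to [0, 1]), and the graded relevance labels are made up for the example.

```python
import math

def dcg(relevances):
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(ranked_docs, labels, k=10):
    gains = [labels.get(d, 0) for d in ranked_docs[:k]]
    ideal = sorted(labels.values(), reverse=True)[:k]
    return dcg(gains) / dcg(ideal) if dcg(ideal) > 0 else 0.0

def fuse(lexical, vector, alpha):
    # alpha weights the vector retriever; both score dicts are
    # assumed already normalized to [0, 1].
    docs = set(lexical) | set(vector)
    fused = {d: (1 - alpha) * lexical.get(d, 0.0) + alpha * vector.get(d, 0.0)
             for d in docs}
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical normalized scores and relevance labels for one query.
lexical = {"a": 1.0, "b": 0.4, "c": 0.1}
vector  = {"b": 1.0, "c": 0.8, "d": 0.3}
labels  = {"b": 3, "c": 2, "a": 1}  # higher = more relevant

best_alpha = max((a / 10 for a in range(11)),
                 key=lambda a: ndcg(fuse(lexical, vector, a), labels))
```

A real evaluation would average NDCG over a query set rather than a single query, but the shape is the same: the weight is just another parameter to fit against labeled data.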

Limitations

  • Normalization is a hyperparameter. The wrong choice (min-max vs z-score vs sigmoid) distorts the fusion, and no single default works across workloads.
  • Outlier-sensitive without careful design. Min-max in particular fails when one retriever emits a score blowout — one result dominates.
  • Requires calibration effort. If one retriever's scores drift (embedding model update, BM25 parameter retune), the fusion needs re-tuning.
  • More hyperparameters than RRF. RRF has one knob (k); RSF has per-retriever weight + normalization method + (often) per-retriever α.
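The outlier-sensitivity limitation is easy to demonstrate with min-max. The numbers below are hypothetical: one retriever emits a score blowout, and the remaining documents get squeezed into the bottom of the range.

```python
def min_max(xs):
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

# A score blowout: one result at 100, the rest in single digits.
raw = [100.0, 5.0, 4.0, 3.0]
normed = min_max(raw)

# The outlier claims the whole [0, 1] range; the other three documents
# land within ~2% of each other near zero, so their relative ordering
# contributes almost nothing to the fused score.
spread_of_rest = max(normed[1:]) - min(normed[1:])
```

This is exactly the failure mode that outlier-resistant choices (z-score, quantile) or rank-based RRF avoid, at the cost of magnitude information.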

Where it's used

  • MongoDB Atlas's native hybrid search function. RSF is the second of the two canonical fusion techniques MongoDB's 2025-09-30 post identifies alongside RRF: "Both approaches quickly gained traction and have become standard techniques in the market."
  • Weaviate's hybrid search (named "relativeScoreFusion" in their API) — the default historically was RRF; RSF is a first-class option since 1.20.
  • Many custom production stacks. Ad-hoc weighted-sum fusion with min-max normalization is a common DIY shape that is, by MongoDB's taxonomy, an RSF variant.
  • Figma's score-fusion — min-max per index + exact-match boost + interleave; the boost-plus-normalize shape is a specific RSF realization sitting beside generic RSF in the design space.

When to pick RSF over RRF

  • Retrievers with well-calibrated raw scores — cosine similarity, Figma-style min-max-normalized lexical + vector, learned retrievers trained with probability outputs.
  • Workloads where fine-grained rank differentiation matters — user-facing search where you need to rank item 4 vs item 5 correctly, not just "both in top 10".
  • When one modality is measurably stronger — the weighted-sum form makes the bias explicit and tunable.

RSF loses to RRF on: new retrievers that haven't been score-calibrated, heterogeneous retriever mixes (BM25 + vector + rules + LTR), and teams without eval infrastructure to tune the weights.
