CONCEPT Cited by 1 source
Multi-task retrieval scoring¶
Definition¶
Multi-task retrieval scoring is the practice — first canonicalised on the wiki via Meta's 2026-05-26 SilverTorch post (Source: sources/2026-05-26-meta-silvertorch-index-as-model-a-new-retrieval-paradigm-for-recommendation-systems) — of applying multi-objective composite scoring to candidates inside the retrieval forward pass, rather than deferring multi-task ranking to a downstream stage.
"SilverTorch also makes retrieval natively multi-objective. A scoring layer combines predictions for different user actions into a single composite score, so retrieval is no longer optimizing around one coarse similarity signal. Instead, it can evaluate a broad candidate pool against a richer notion of user engagement before late-stage ranking begins."
Why this is structurally new¶
In the canonical retrieval → ranking funnel, retrieval is "a fast pruning step" scored mostly by two-tower dot-product similarity, and multi-task multi-label (MTML) scoring happens only at the L2 ranker. Under the microservice-mesh architecture that drove the funnel design, the retrieval-stage compute budget is too tight for richer scoring.
Two structural changes in Index as Model make multi-task scoring affordable inside retrieval:
- Item representations and cross-features remain in GPU memory between ANN search and scoring — no service hop, no serialization, no fan-out cost between the embedding-similarity stage and the scoring stage.
- The retrieval candidate pool can be one-to-two orders of magnitude wider because the same forward pass that retrieves can score more candidates without leaving the GPU. SilverTorch's fused Int8 ANN returns "hundreds of thousands" of candidates (vs Faiss-GPU's 2,048 ceiling).
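The two changes above can be sketched as a single in-process function that searches and then scores without handing candidates to another service. This is only an illustration under stated assumptions: exact top-k stands in for the ANN index, plain Python lists stand in for GPU-resident tensors, and `retrieve`, `toy_scorer`, `ann_k`, and `final_n` are hypothetical names, not SilverTorch APIs.

```python
import heapq
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def retrieve(user, item_table, scorer, ann_k, final_n):
    """Search and score in one call, with no hop between the two stages."""
    # Stage 1: similarity search. Exact top-k stands in for an ANN index here.
    sims = ((dot(user, emb), idx) for idx, emb in enumerate(item_table))
    survivors = [idx for _, idx in heapq.nlargest(ann_k, sims)]
    # Stage 2: richer scoring reads the same in-memory table by index,
    # so nothing is serialized or shipped to another service.
    scored = ((scorer(user, item_table[idx]), idx) for idx in survivors)
    final = [idx for _, idx in heapq.nlargest(final_n, scored)]
    return survivors, final

def toy_scorer(user, emb):
    # Hypothetical stand-in for a multi-objective scoring layer.
    return math.tanh(dot(user, emb)) + 0.1 * sum(abs(x) for x in emb)

random.seed(0)
item_table = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(1000)]
user = [random.uniform(-1, 1) for _ in range(8)]
survivors, final = retrieve(user, item_table, toy_scorer, ann_k=200, final_n=10)
```

Because stage 2 indexes into the same table that stage 1 searched, widening `ann_k` adds scoring work but no transfer cost, which is the property the list above attributes to keeping everything on the GPU.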
The post states the consequence explicitly (verbatim): retrieval "can evaluate a broad candidate pool against a richer notion of user engagement before late-stage ranking begins. The result is a wider funnel with more intelligence inside it — more candidates survive early retrieval, and they are screened by more sophisticated, multi-objective scoring before being passed to the final ranking."
Composite-score shape¶
The post names the concrete signal: combining predictions for like / share / comment into a single composite score in retrieval. This is structurally the same multi-task-multi-label shape as MTML rankers (Meta Friend Bubbles, Meta Adaptive Ranking Model, Meta Groups Search), but applied at retrieval-pool scope rather than at final-rank scope.
Mathematically: instead of score = dot_product(user, item), retrieval scoring becomes score = f(P(like | u, i), P(share | u, i), P(comment | u, i), ...) for some combining function f, evaluated on the candidate pool that survives ANN. The post does not say whether f is learned or hand-weighted (see Caveats).
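A minimal sketch of the contrast above, assuming a weighted sum as the combiner f and a logistic head per action. The post discloses neither, so the weights, the head parameters, and the names `task_probs` and `composite_score` are hypothetical illustrations rather than SilverTorch's method.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def task_probs(user, item, heads):
    """Toy per-action heads: P(action | u, i) = sigmoid(w_action . (u * i))."""
    interaction = [a * b for a, b in zip(user, item)]
    return {action: sigmoid(dot(w, interaction)) for action, w in heads.items()}

def composite_score(probs, weights):
    """Assumed combiner f: a weighted sum of the per-action probabilities."""
    return sum(weights[action] * p for action, p in probs.items())

user = [1.0, 0.5]
items = {"a": [0.9, 0.1], "b": [0.2, 1.0]}
# Hypothetical head parameters and objective weights.
heads = {"like": [1.0, 1.0], "share": [-1.0, 3.0], "comment": [0.0, 2.0]}
weights = {"like": 1.0, "share": 2.0, "comment": 1.5}

# Single-signal retrieval: rank by dot-product similarity only.
by_dot = max(items, key=lambda k: dot(user, items[k]))
# Multi-task retrieval: rank by the composite of all three predictions.
by_composite = max(
    items,
    key=lambda k: composite_score(task_probs(user, items[k], heads), weights),
)
```

With these toy numbers the dot product prefers item `a`, while the composite prefers item `b` because its predicted share and comment probabilities are higher. The point is only that a composite can reorder candidates relative to a single similarity signal.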
Relationship to neural reranking¶
The post pairs multi-task scoring with neural reranking as the two retrieval-quality unlocks:
- Neural reranking — "multi-layer perceptrons, stacked self-attention, or more structured interaction models such as mixture of logits" applied to retrieval candidates, producing a richer relevance score than dot-product similarity.
- Multi-task scoring — composite of multiple engagement-action probabilities into one score (this concept).
Both require GPU-resident item representations and cross-features, both run inside the retrieval forward pass, and both contribute to the "widened funnel" characterisation.
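As a rough illustration of why a neural reranker can order candidates differently from dot-product similarity, the sketch below scores two items with a tiny one-hidden-layer MLP over the concatenated user and item vectors. The weights are hand-picked for the example, and `mlp_score` and every constant are hypothetical; the post does not describe SilverTorch's reranker at this level of detail.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def relu(x):
    return x if x > 0.0 else 0.0

# Hand-picked toy parameters (hypothetical, not learned).
W1 = [[1.0, 0.0, 1.0, 0.0],
      [0.0, 1.0, -1.0, 1.0]]
B1 = [0.0, 0.0]
W2 = [1.0, 2.0]
B2 = 0.0

def mlp_score(user, item):
    """One hidden ReLU layer over the concatenated [user, item] vector."""
    x = user + item  # list concatenation
    hidden = [relu(dot(row, x) + b) for row, b in zip(W1, B1)]
    return dot(W2, hidden) + B2

user = [1.0, 0.0]
item_a = [1.0, 0.0]
item_b = [0.0, 1.0]

dot_scores = {"a": dot(user, item_a), "b": dot(user, item_b)}
mlp_scores = {"a": mlp_score(user, item_a), "b": mlp_score(user, item_b)}
```

Here the dot product ranks `a` above `b`, while the MLP ranks `b` above `a`, because the second hidden unit responds to an interaction that similarity alone cannot express.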
Relationship to existing wiki multi-task material¶
- concepts/multi-task-learning is the general training-paradigm concept — shared trunk + task-specific heads, MMoE, PLE.
- concepts/multi-task-multi-label-ranking is the L2-ranker instance — one ranker scoring multiple engagement objectives jointly.
- This page (multi-task retrieval scoring) is the L1-retrieval instance — same multi-objective principle applied at the retrieval-stage altitude, made affordable by concepts/index-as-model.
- patterns/auxiliary-engagement-task-for-conversion-retrieval (Pinterest 2026-04-27) is a sibling pattern from a different company — using auxiliary engagement-task signals to improve sparse-conversion retrieval via the same multi-task-in-retrieval strategy, motivated by sparse labels rather than by GPU-substrate consolidation.
Caveats¶
- The post discloses the shape (composite of like / share / comment predictions) but not the functional form of the composite (linear weights / learned MLP / ratio-based / Pareto).
- Whether the multi-task scoring layer is trained jointly with the user / item towers, or applied as a downstream lightweight head, is not detailed — both are common in MTML ranking.
- The exact candidate-pool size at which multi-task scoring is applied (just the top-N from ANN? all hundreds-of-thousands?) is not specified.
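To see why the undisclosed functional form in the first caveat matters, the sketch below applies two of the candidate forms (a linear weighting and a Pareto-front filter) to the same per-action predictions. All numbers and names (`linear`, `pareto_front`, `cands`) are made up for illustration; nothing here reflects what Meta actually uses.

```python
def linear(probs, weights):
    """Weighted sum of per-action probabilities."""
    return sum(w * p for w, p in zip(weights, probs))

def dominates(p, q):
    """p dominates q if it is at least as good on every task and better on one."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

def pareto_front(cands):
    """Names of candidates that no other candidate dominates."""
    return sorted(
        name for name, p in cands.items()
        if not any(dominates(q, p) for q in cands.values())
    )

# Hypothetical (P(like), P(share), P(comment)) per candidate.
cands = {
    "x": (0.9, 0.1, 0.1),
    "y": (0.4, 0.4, 0.4),
    "z": (0.3, 0.3, 0.3),
}

top_equal = max(cands, key=lambda n: linear(cands[n], (1.0, 1.0, 1.0)))
top_like_heavy = max(cands, key=lambda n: linear(cands[n], (3.0, 1.0, 1.0)))
front = pareto_front(cands)
```

Equal weights pick `y`, like-heavy weights pick `x`, and the Pareto filter keeps both while dropping `z`. The same predictions therefore yield different retrieval sets depending on a detail the post leaves open.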
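Seen in¶
- sources/2026-05-26-meta-silvertorch-index-as-model-a-new-retrieval-paradigm-for-recommendation-systems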
Related¶
- systems/silvertorch · systems/pytorch
- concepts/multi-task-learning · concepts/multi-task-multi-label-ranking · concepts/index-as-model · concepts/retrieval-ranking-funnel · concepts/two-tower-architecture · concepts/ann-index
- patterns/unified-pytorch-model-as-retrieval-system · patterns/auxiliary-engagement-task-for-conversion-retrieval
- companies/meta