
Multi-task multi-label (MTML) ranking

Definition

MTML ranking is the class of recommendation and search ranker architectures that share a candidate-plus-context representation across a set of tasks (distinct prediction targets such as watch, like, comment, share, follow), each with potentially multiple labels (binary, multi-class, or graded), trained jointly under a combined loss.

Structurally an MTML ranker is typically:

      (user + candidate + context features)
           [ shared encoder / trunk ]
       ┌─────────────┼─────────────┐
       ▼             ▼             ▼
   head: watch   head: like    head: bubble-
    (logprob)   (binary)       conditioned
                               engagement
                                (P(y | bubble))

The shared trunk amortises feature-extraction cost across all tasks, while the task-specific heads specialise for each prediction. Variants such as MMoE (Multi-gate Mixture of Experts), PLE (Progressive Layered Extraction), and shared-bottom vs task-specific-bottom designs differ in how the shared representation flows to each head.
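
The trunk-plus-heads shape above can be sketched in a few lines. Layer sizes, head names, and activations here are hypothetical stand-ins (the Meta post discloses none of them), with plain NumPy standing in for a real deep-learning stack:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- none of these are disclosed in the post.
D_IN, D_TRUNK = 16, 8

# Shared trunk: one dense layer with ReLU (a minimal "shared bottom").
W_trunk = rng.normal(size=(D_IN, D_TRUNK))

# Task-specific heads: one linear + sigmoid scorer per task.
heads = {t: rng.normal(size=D_TRUNK) for t in ("watch", "like", "bubble_engagement")}

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def score(features):
    """One forward pass over the trunk serves every head."""
    h = np.maximum(features @ W_trunk, 0.0)              # shared representation
    return {task: sigmoid(h @ w) for task, w in heads.items()}

x = rng.normal(size=D_IN)                                # user + candidate + context features
print(score(x))                                          # one probability per task
```

An MMoE or PLE variant would replace the single `W_trunk` with several expert sub-networks and per-head gating, but the trunk-then-heads flow is the same.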

Why MTML instead of independent rankers

  • Cost. One forward pass over the shared encoder scores N tasks; independent rankers do N forward passes.
  • Shared regularisation. Auxiliary tasks regularise the main task; rare-task labels benefit from signals in common-task labels.
  • Consistent candidate scoring. All tasks score the same candidate with the same features — no representation drift across tasks.
  • Easy to add tasks. When a new signal class appears (e.g. bubble-conditioned engagement), it's a new head, not a new model.
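
Joint training under a combined loss usually means a weighted sum of per-task losses, all of whose gradients flow back into the shared trunk. A minimal sketch, with illustrative task names, labels, and weights (none from the source):

```python
import math

# Hypothetical per-task predictions and binary labels for one example.
preds  = {"watch": 0.8, "like": 0.3, "bubble_engagement": 0.6}
labels = {"watch": 1,   "like": 0,   "bubble_engagement": 1}
task_weights = {"watch": 1.0, "like": 0.5, "bubble_engagement": 0.5}

def bce(p, y):
    """Binary cross-entropy for one prediction."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# The combined loss is a weighted sum over heads; in training, gradients
# from every task would flow back into the shared trunk.
total_loss = sum(task_weights[t] * bce(preds[t], labels[t]) for t in preds)
print(round(total_loss, 4))
```

The `task_weights` are exactly the tuning surface the Caveats section flags: they trade main-task accuracy against auxiliary-task signal.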

Canonical wiki reference

Meta Friend Bubbles (sources/2026-03-18-meta-friend-bubbles-enhancing-social-discovery-on-facebook-reels) uses MTML at both early-stage and late-stage ranking in Facebook Reels:

"We integrated friend-bubble interaction signals as features and added new tasks into both early-stage and late-stage ranking multi-task, multi-label (MTML) models to incorporate viewer-friend relationship strength and to learn downstream engagement on videos with social bubbles."

Two architectural additions for Friend Bubbles:

  1. New features. Viewer-friend closeness scores + bubble-interaction signals become input features.
  2. New tasks. Bubble-conditioned engagement — P(video engagement | bubble impression) — becomes a new ranker head, and its output enters the augmented ranking formula via a tunable weight. See patterns/conditional-probability-ranking-objective.
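
The augmented ranking formula in item 2 can be sketched as a weighted combination in which the bubble-conditioned term carries its own tunable weight. Every name and number below is illustrative; the real formula's composition is not disclosed:

```python
# Hypothetical head outputs for one candidate video.
head_scores = {
    "watch": 0.72,
    "like": 0.10,
    "bubble_engagement": 0.55,   # P(video engagement | bubble impression)
}

# Baseline value-model weights plus the new tunable bubble weight.
weights = {"watch": 1.0, "like": 0.4}
w_bubble = 0.3

def final_score(scores, has_bubble):
    """Weighted combination; the bubble term only applies to bubbled candidates."""
    base = sum(weights[t] * scores[t] for t in weights)
    bonus = w_bubble * scores["bubble_engagement"] if has_bubble else 0.0
    return base + bonus

print(final_score(head_scores, has_bubble=True))
```

Conditioning the bubble head's output on `has_bubble` keeps the conditional probability from affecting candidates it was never defined for.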

A continuous feedback loop re-trains the MTML models on fresh bubble-interaction data, letting them keep learning "which friend-content combinations resonate with users."

Early-stage vs late-stage

The Meta post distinguishes "early-stage" and "late-stage" MTML models — a standard recommendation-ranking convention:

  • Early-stage ranking scores a larger candidate pool with a cheaper model, narrows to a smaller set.
  • Late-stage ranking scores the narrowed set with a more expensive MTML.

Both stages are MTML-shaped in Meta's Reels system, and both had the bubble features and tasks added. This matters: a signal added only at the late stage cannot recover candidates the early-stage ranker has already eliminated. Adding the signal at both stages lets it propagate through the whole funnel.
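
The funnel mechanics can be sketched as follows. The scoring functions are trivial stand-ins, but they show the narrowing step that makes late-stage-only signals unrecoverable: whatever the cheap model drops never reaches the expensive one.

```python
# Minimal two-stage funnel sketch; all field names are hypothetical.

def early_score(candidate):
    """Cheap model over the large pool. If this ignored bubble_affinity,
    bubbled candidates could be eliminated before the late stage ever ran."""
    return candidate["prior"] + 0.3 * candidate["bubble_affinity"]

def late_score(candidate):
    """Expensive MTML over the narrowed set (stand-in: a richer combination)."""
    return candidate["prior"] + 0.3 * candidate["bubble_affinity"] + 0.5 * candidate["watch_pred"]

pool = [
    {"id": i, "prior": 0.5 - 0.0001 * i, "bubble_affinity": (i % 3) * 0.2,
     "watch_pred": 0.1 * (i % 5)}
    for i in range(1000)
]

# Early stage: narrow 1,000 candidates to 100.
shortlist = sorted(pool, key=early_score, reverse=True)[:100]
# Late stage: rank only the survivors with the expensive model.
ranked = sorted(shortlist, key=late_score, reverse=True)
print(len(shortlist), ranked[0]["id"])
```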

Caveats

  • Topology is not described. The Meta post names "MTML" as a class — number of tasks, head composition, encoder depth, MMoE/PLE variant, gradient-balancing scheme are all undisclosed.
  • Task interference risk. Joint training across many heads can hurt main-task accuracy if gradients conflict — the known failure mode MMoE/PLE were designed to address. The Meta post does not discuss this.
  • Loss-weighting is a tuning surface. The w · P(engage | bubble) term sits alongside other tunable weights in the overall ranking formula; the tuning procedure (offline grid, Bayesian optimisation, adaptive weighting) is not specified.
  • Feature-engineering vs model-architecture tradeoff. Adding closeness as a feature is only part of the fix; Meta's insight is that friend content follows a different distribution that needs its own task to be scored correctly, and pure feature addition would under-fit it. This is the load-bearing architectural claim.
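
On the loss-weighting caveat: since the tuning procedure is unspecified, the simplest of the options named (an offline grid sweep over the bubble weight) might look like the sketch below. The replay metric is a hypothetical stand-in for whatever offline evaluation Meta actually uses:

```python
# Generic offline grid sweep over the tunable bubble weight w.
# The metric below is a made-up proxy, peaked at w = 0.3 purely for illustration.

def replay_metric(w_bubble):
    """Stand-in for an offline replay evaluation (e.g. recall of engaged items)."""
    return 1.0 - (w_bubble - 0.3) ** 2

grid = [i / 20 for i in range(21)]        # w in {0.00, 0.05, ..., 1.00}
best_w = max(grid, key=replay_metric)
print(best_w)
```

Bayesian optimisation or adaptive weighting would replace the grid with a smarter search, but the shape (pick w to maximise an offline proxy, then confirm online) is the same.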

Seen in

  • sources/2026-03-18-meta-friend-bubbles-enhancing-social-discovery-on-facebook-reels — canonical; names MTML at both early + late stages with new tasks for bubble-conditioned engagement.
  • sources/2026-03-03-pinterest-unifying-ads-engagement-modeling-across-pinterest-surfaces — Pinterest ads engagement unification; MTML applied across product surfaces rather than across task semantics. Pinterest's unified ads engagement model uses multi-task heads where the task axis is the surface itself (HF / SR / RP as separate tasks). Structurally it is the same MTML shape (shared trunk plus task-specific heads) but with different task semantics: Meta uses MTML tasks for different engagement types within one surface (Reels: watch / like / bubble-conditioned engagement), while Pinterest uses MTML tasks for the same engagement prediction (CTR) across different surfaces. Pinterest's variant adds surface-specific tower trees and surface-specific checkpoint exports as deployment-level specialisation mechanisms, structurally analogous to MMoE gating but at tower/deployment granularity. This extends the MTML concept beyond single-surface multi-engagement to multi-surface single-engagement.