PATTERN Cited by 1 source

Conditional-probability ranking objective

Pattern

Integrate a new ranking signal into an existing multi-objective ranker by adding a conditional-probability term P(outcome | new condition) with a tunable weight, rather than re-tuning the whole formula from scratch. The existing weighted combination of engagement objectives stays; the new signal enters as one more term the ranker can learn to balance.

Structure:

existing_formula = w1 · P(watch)
                 + w2 · P(like)
                 + w3 · P(comment)
                 + ...

new_formula      = existing_formula
                 + w_bubble · P(video engagement | bubble impression)
                              └────────── new conditional-probability term ──────────┘

The new term is:

  • Conditional on a new observable event (bubble impression, notification tap, badge shown, recommendation card surfaced).
  • Probability of engagement given that condition — the quantity that specifically captures the value of the new condition.
  • Weighted tunably against the existing objectives, letting operators balance the new signal's importance without re-tuning every existing weight.
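The structure above can be sketched in a few lines. All weights and predicted probabilities here are illustrative placeholders, not Meta's values:

```python
# Illustrative sketch of the augmented scoring formula.
# Every number below is a hypothetical placeholder.

def score(preds: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of predicted probabilities, one term per objective."""
    return sum(weights[name] * preds[name] for name in weights)

existing_weights = {"p_watch": 1.0, "p_like": 0.5, "p_comment": 0.8}
preds = {
    "p_watch": 0.6,
    "p_like": 0.1,
    "p_comment": 0.05,
    "p_engage_given_bubble": 0.3,  # the new conditional-probability prediction
}

base = score(preds, existing_weights)

# Adding the new signal is one more (term, weight) pair, not a new formula.
new_weights = {**existing_weights, "p_engage_given_bubble": 0.4}  # w_bubble, tunable
augmented = score(preds, new_weights)
# augmented == base + w_bubble * P(engage | bubble); nothing else changed.
```

Tuning `w_bubble` up or down moves only the new term's influence; the existing objectives' relative balance is untouched.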

When to use

  • You have an existing multi-objective ranker that's well-tuned for current engagement.
  • You're adding a new signal (friend-bubble impression, new surface, new user action) that has its own value proposition — messaging friends, social discovery, a new content family.
  • You want the new signal to shape the ranker's behaviour but not dominate it — existing objectives must still matter.
  • You have a mechanism to measure the conditional probability — bubble impressions are observable, downstream engagement is observable.

Why conditional, not marginal

P(video engagement | bubble impression) — conditional — captures "given that we showed a bubble, how likely is the user to engage with the video." This is the quantity that matters for the new signal's value.

P(video engagement) — marginal — is already (implicitly) captured by existing engagement objectives. Adding it again would double-count.

P(bubble interaction) — different outcome variable — measures bubble engagement, not video engagement. The Meta Friend Bubbles post is explicit that the intent is to drive video engagement, with bubbles as the social-context signal.

Choosing the conditional probability sharpens the objective to exactly the new signal's contribution.
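A toy illustration of the conditional-versus-marginal distinction, using a hypothetical impression log (the log schema and numbers are assumptions for the sketch):

```python
# Hypothetical impression log: (bubble_shown, video_engaged) per impression.
logs = [
    (True, True), (True, True), (True, False),
    (False, False), (False, False), (False, True),
]

# Marginal P(engage): averaged over all impressions; this quantity is
# already implicitly covered by the existing engagement objectives.
p_engage = sum(engaged for _, engaged in logs) / len(logs)

# Conditional P(engage | bubble shown): restrict to impressions where a
# bubble actually appeared -- the quantity that isolates the bubble's value.
shown = [engaged for bubble, engaged in logs if bubble]
p_engage_given_bubble = sum(shown) / len(shown)
```

In this toy log the marginal is 1/2 while the conditional is 2/3; the gap is exactly what the new term is meant to capture.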

Canonical wiki reference

Meta Friend Bubbles (sources/2026-03-18-meta-friend-bubbles-enhancing-social-discovery-on-facebook-reels):

"We augmented our existing video-ranking formula, which includes several optimization goals, with a friend-bubble ranking objective designed to maximize overall video engagement. We consider interaction metrics such as watch time, comments and likes, and use a conditional probability term, P(video engagement | bubble impression), to predict the likelihood that a user will engage with a video after seeing a friend bubble."

"This is balanced with tunable weights that manage trade-offs between social interaction and video engagement, allowing us to optimize for social connection (helping people discover videos their friends like) and content quality."

Key properties of the Meta instance:

  1. Existing formula preserved. Watch time, comments, likes — the existing objectives stay.
  2. New term added, not substituted. P(engage | bubble) enters as a new term.
  3. Tunable weights. The trade-off between "social interaction and video engagement" is explicit and operator-controlled.
  4. Dual optimisation framing. Social connection + content quality are framed as orthogonal axes both worth optimising, not competing goals.

Composing with MTML

In an MTML ranker, the conditional-probability term is naturally a new task head:

  • Existing heads predict P(watch), P(like), P(comment), …
  • New head predicts P(engage | bubble impression) — trained on examples where a bubble was shown and engagement did or did not follow.
  • The ranker scores candidates using a weighted sum of head outputs, and the new head's weight is the tuning surface for mixing the new signal.

The pattern is therefore architecture-compatible with MTML: adding a new term doesn't require a new model, just a new head + a new weight. This is a significant architectural win — most new signals can be absorbed without rewriting the ranker.
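A minimal sketch of that composition, with toy linear heads standing in for the shared network (all parameters are hypothetical, not Meta's):

```python
import math

def sigmoid(z: float) -> float:
    return 1 / (1 + math.exp(-z))

class MultiTaskRanker:
    """Toy MTML-style scorer: shared features feed per-task heads."""

    def __init__(self):
        # Each head is a small readout over the shared feature vector.
        self.heads = {
            "p_watch":   [0.8, -0.2],
            "p_like":    [0.3,  0.1],
            "p_comment": [0.1,  0.4],
        }
        self.weights = {"p_watch": 1.0, "p_like": 0.5, "p_comment": 0.8}

    def add_head(self, name, params, weight):
        # Adding a new signal = new head + new mixing weight; no new model.
        self.heads[name] = params
        self.weights[name] = weight

    def score(self, features):
        # Weighted sum of per-head probabilities, as in the formula above.
        return sum(
            w * sigmoid(sum(p * f for p, f in zip(self.heads[name], features)))
            for name, w in self.weights.items()
        )

ranker = MultiTaskRanker()
before = ranker.score([1.0, 0.5])
ranker.add_head("p_engage_given_bubble", [0.6, 0.6], weight=0.4)
after = ranker.score([1.0, 0.5])  # score rises by w_bubble * new head output
```

The key point the sketch shows: `add_head` touches nothing about the existing heads or their weights, which is why the pattern composes cleanly with a multi-task architecture.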

Extension: paired with a feedback loop

Meta pairs this pattern with a continuous feedback loop (patterns/closed-feedback-loop-ai-features): bubble-impression → engagement data flows back into training, so the conditional-probability head keeps learning which friend-content combinations actually drive engagement. Without the feedback loop, the head's prediction quality degrades as content distribution shifts; with it, the head adapts continuously.
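A crude sketch of that adaptation, using an exponential moving average as a stand-in for periodic retraining (the update rule and decay rate are assumptions for illustration, not Meta's pipeline):

```python
class ConditionalHead:
    """Toy stand-in for the P(engage | bubble) head under a feedback loop."""

    def __init__(self, decay: float = 0.99):
        self.estimate = 0.5   # prior estimate of P(engage | bubble)
        self.decay = decay    # older impressions count less as content shifts

    def observe(self, engaged: bool):
        # Each logged (impression -> outcome) pair nudges the estimate,
        # so it tracks the current content distribution.
        self.estimate = self.decay * self.estimate + (1 - self.decay) * float(engaged)

head = ConditionalHead()
for engaged in [True, True, False, True]:  # logged post-impression outcomes
    head.observe(engaged)
```

Freezing the loop (never calling `observe`) leaves the estimate pinned to a stale distribution, which is the degradation the pattern's feedback pairing avoids.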

Caveats

  • Selection bias in the conditioning event. The conditional is only observable when bubbles are shown. If bubbles are shown preferentially on high-engagement videos (because the retriever selected them for closeness), the training distribution is biased. This calls for counterfactual evaluation or IPW-style debiasing (e.g. AIPW).
  • Weight tuning is live. w_bubble changes product behaviour directly; tuning requires A/B infrastructure + care for second-order effects (ecosystem health, creator incentives).
  • Can incentivise the wrong thing. Optimising P(engage | bubble) rewards showing bubbles on the videos most likely to be engaged with given a bubble, which can overlap heavily with videos users would have engaged with anyway. Counterfactual lift measurement is needed.
  • Interaction with other objectives. Adding the new term can shift weights on existing terms in non-obvious ways; full re-tuning after adding a major new term is sometimes necessary.
  • Implementation-level opacity. Meta names the term but does not disclose the weight, the exact definition of "engagement," the training-data size, or the feedback-loop cadence.
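To make the selection-bias caveat concrete, a minimal self-normalized inverse-propensity-weighting sketch, with hypothetical propensities of the bubble being shown on each video:

```python
# Each row: (engaged, propensity that a bubble was shown on that video).
# Propensities are made-up; in practice they come from the serving policy.
logs = [
    (1, 0.9), (1, 0.8), (0, 0.9), (1, 0.2), (0, 0.1),
]

# Naive estimate: average engagement over logged bubble impressions.
# Biased upward here, because high-engagement videos got bubbles more often.
naive = sum(engaged for engaged, _ in logs) / len(logs)

# Self-normalized IPW: reweight each impression by 1 / propensity to
# estimate engagement over the full video distribution.
ipw = sum(e / p for e, p in logs) / sum(1 / p for _, p in logs)
```

In this toy log the naive estimate is 0.6 while the IPW estimate is lower, illustrating how training on shown-only data overstates the conditional head's target.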

Anti-patterns

  • Absorb new signal into existing engagement head. Collapses the new-signal contribution into existing-signal training data; no way to isolate + tune.
  • Create a new ranker just for the new signal. Duplicates infrastructure; candidate scoring becomes inconsistent across rankers.
  • Replace existing formula entirely. Throws out existing tuning; usually regresses main-line metrics.
  • Set w_bubble without an A/B test. Direct production deployment of a new weight usually has unexpected second-order effects.

Seen in
