

CTR prediction

Definition

CTR prediction (click-through rate prediction) is the machine-learning task of estimating the probability that a user will engage (click, like, watch, tap) with a specific candidate item (ad, content, link, Pin) in a specific context. It is the core scoring primitive under ads ranking and recommendation systems: given user, candidate, and context features, output a probability P(engage | user, candidate, context).

CTR prediction is typically trained on binary engagement labels (engaged / not engaged) at the impression level, with log-loss or another calibrated classification objective. In multi-task / multi-label recsys architectures (see MTML ranking), CTR is often one of several task heads (CTR + like + share + follow, each with its own head) trained jointly.
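The impression-level objective above is plain binary cross-entropy (log-loss). A minimal sketch, with the model and feature encoding elided and illustrative labels/probabilities:

```python
import math

def log_loss(labels, probs, eps=1e-12):
    """Mean binary cross-entropy over impression-level (engaged / not engaged) labels."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

# Four impressions: two engagements, two non-engagements, with predicted CTRs.
labels = [1, 0, 1, 0]
probs = [0.8, 0.1, 0.6, 0.3]
print(round(log_loss(labels, probs), 4))  # → 0.299
```

In the multi-task setting, each head (CTR, like, share, follow) would contribute its own such loss term, typically summed with per-task weights.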

Why it's load-bearing

The CTR score enters the final ranking formula and drives the auction mechanics in ads serving:

  • Ad ranking. Candidate ads are ranked by bid × predicted CTR (or a more complex expected-value formula); higher-CTR ads win more auctions.
  • Calibration. Predicted probabilities must be close to empirical engagement rates for the auction to be economically valid; overconfident CTR predictions bias budget spend toward specific surfaces or ad formats and break revenue guarantees. See surface-specific calibration.
  • Serving cost. Every ad candidate needs a CTR score, so the CTR model is evaluated O(candidates) times per request — the scaling axis behind optimisations like request-level user-embedding broadcasting and surface-specific tower trees.
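The ranking bullet can be illustrated with a toy auction: sort candidates by bid × predicted CTR, the expected value per impression. Ad IDs, bids, and CTRs here are made up for illustration:

```python
# Toy candidates: (ad_id, bid_in_dollars, predicted_ctr). Illustrative values only.
candidates = [
    ("ad_a", 2.00, 0.010),
    ("ad_b", 0.50, 0.050),
    ("ad_c", 1.00, 0.030),
]

def expected_value(candidate):
    _, bid, pctr = candidate
    return bid * pctr  # expected revenue per impression

ranked = sorted(candidates, key=expected_value, reverse=True)
print([ad_id for ad_id, _, _ in ranked])  # → ['ad_c', 'ad_b', 'ad_a']
```

Note that the highest bidder (ad_a at $2.00) finishes last: its low predicted CTR makes its expected value per impression the smallest, which is exactly why a biased CTR estimate distorts the auction.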

Canonical wiki instance

Pinterest's unified ads engagement model predicts CTR across three ads surfaces (Home Feed, Search, Related Pins) with surface-specific calibration as the critical correctness mechanism (Source: sources/2026-03-03-pinterest-unifying-ads-engagement-modeling-across-pinterest-surfaces):

"Since the unified model serves both HF and SR traffic, calibration is critical for CTR prediction. We found that a single global calibration layer could be suboptimal because it implicitly mixes traffic distributions across surfaces."
