Surface-specific calibration¶
Definition¶
Surface-specific calibration is the use of separate calibration layers for each traffic segment (product surface, view type, user segment, country, device class) in a unified CTR prediction / ranking model, rather than a single shared calibration head over the combined distribution.
The underlying problem: a shared calibration layer implicitly mixes traffic distributions across segments — it's trained on the joint distribution of all surfaces' traffic and systematically mis-calibrates each individual sub-distribution. Different surfaces have different base CTRs, different feature availability, different user-intent priors; a single logistic/isotonic calibration head cannot accurately map logits to empirical rates across all of them simultaneously.
Architectural shape¶
[ unified model trunk + surface-specific tower trees ]
                           │
         ┌─────────────────┼─────────────────┐
         ▼                 ▼                 ▼
     HF logits         SR logits         RP logits
         │                 │                 │
   HF calibration    SR calibration    RP calibration
         │                 │                 │
      HF CTR            SR CTR            RP CTR
Each surface's logits flow through a view-type-specific calibration layer (Platt scaling, isotonic regression, or a learned calibration head) trained on that surface's traffic distribution. The shared trunk provides representation; the surface-specific calibration ensures the final probabilities are correctly calibrated within each surface's distribution.
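Pinterest does not disclose its calibration head architecture (see Caveats), so the following is a hypothetical sketch assuming Platt scaling — one of the common choices named above. `SurfaceCalibrator` and `fit_platt` are illustrative names: one logistic map per surface, each fit only on that surface's traffic, applied on top of the shared trunk's logits.

```python
import numpy as np

def fit_platt(logits, labels, lr=0.5, steps=2000):
    """Fit Platt scaling p = sigmoid(a * logit + b) by gradient descent."""
    a, b = 1.0, 0.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(a * logits + b)))
        grad = p - labels                    # d(log loss)/d(pre-sigmoid)
        a -= lr * np.mean(grad * logits)
        b -= lr * np.mean(grad)
    return a, b

class SurfaceCalibrator:
    """One Platt head per surface, trained on that surface's traffic only."""

    def __init__(self):
        self.heads = {}

    def fit(self, logits, labels, surfaces):
        for s in np.unique(surfaces):
            m = surfaces == s
            self.heads[s] = fit_platt(logits[m], labels[m])
        return self

    def predict(self, logits, surfaces):
        out = np.empty(len(logits))
        for s, (a, b) in self.heads.items():
            m = surfaces == s
            out[m] = 1 / (1 + np.exp(-(a * logits[m] + b)))
        return out
```

Because each head is fit to its own surface's label distribution, the mean calibrated probability matches that surface's empirical CTR — the property a shared head cannot guarantee for every surface at once.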
Canonical wiki instance¶
Pinterest's unified ads engagement model uses surface-specific calibration between Home Feed and Search traffic (Source: sources/2026-03-03-pinterest-unifying-ads-engagement-modeling-across-pinterest-surfaces):
"Since the unified model serves both HF and SR traffic, calibration is critical for CTR prediction. We found that a single global calibration layer could be suboptimal because it implicitly mixes traffic distributions across surfaces. To address this, we introduced a view type specific calibration layer, which calibrates HF and SR traffic separately. Online experiments showed this approach improved performance compared to the original shared calibration."
The architectural move: the trunk can be shared, but the calibration must be split. This is the refinement that lets Pinterest's unified model keep the benefits of shared representation while avoiding the cross-surface miscalibration tax.
Why surface-specific calibration beats shared calibration¶
- Base-rate differences. Home Feed and Search have different base CTRs because of different user intent (browsing vs searching). A shared calibration layer averages these and mis-calibrates both.
- Feature availability differences. Each surface exposes different features (Search has query tokens, HF doesn't; RP has context-Pin embedding). The mapping from logit to calibrated probability should reflect what information is actually in the features for that surface.
- Cross-surface distribution differences are not time-varying drift. A single calibration head trained continuously tracks temporal drift, but it cannot disentangle time effects from surface effects without explicit segmentation.
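The base-rate argument can be made concrete with a toy simulation. The offsets below are illustrative numbers, not Pinterest's: suppose HF and SR true CTRs differ by a per-surface shift of the same logit, and the shared calibration is the identity sigmoid map. The shared map then over-predicts the low-base-rate surface and under-predicts the high-base-rate one.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
logits = rng.normal(0.0, 1.0, 100_000)

# Hypothetical per-surface base-rate offsets on the same logit distribution.
hf_ctr = sigmoid(logits - 2.0)   # browsing intent: low base CTR
sr_ctr = sigmoid(logits + 0.5)   # search intent: high base CTR

# A single shared map logit -> sigmoid(logit) splits the difference:
shared_pred = sigmoid(logits)
print(f"HF: actual {hf_ctr.mean():.3f}, shared pred {shared_pred.mean():.3f}")
print(f"SR: actual {sr_ctr.mean():.3f}, shared pred {shared_pred.mean():.3f}")
```

The shared prediction lands between the two surfaces' actual rates, mis-calibrating both — exactly the failure the per-surface split removes.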
Generalisations¶
The pattern applies beyond surfaces:
- Per-user-segment calibration — separate calibrations for new users vs long-tenured users.
- Per-device calibration — separate calibrations for iOS vs Android vs web.
- Per-country calibration — separate calibrations for high-ARPU vs emerging markets.
- Per-ad-format calibration — separate calibrations for image ads vs video ads vs carousels.
Any time a unified model serves heterogeneous traffic distributions, per-segment calibration is a narrow architectural refinement with measurable online wins.
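One practical wrinkle the generalisations raise is sparsity: fine-grained segment keys (surface × device × country) may have too little traffic to fit their own head. A hypothetical design sketch, not from the source: key calibration heads by segment tuple and fall back to a shared global head for unseen or sparse segments. All names and parameters below are illustrative.

```python
import math

def make_platt_head(a, b):
    """Return a Platt-scaling map logit -> probability with fixed (a, b)."""
    return lambda logit: 1 / (1 + math.exp(-(a * logit + b)))

# Segment-keyed registry; any composite key selects its own head.
heads = {
    ("HF", "ios"): make_platt_head(1.1, -2.0),   # illustrative parameters
    ("SR", "ios"): make_platt_head(0.9, 0.5),
}
global_head = make_platt_head(1.0, 0.0)          # shared fallback

def calibrate(logit, segment):
    """Per-segment calibration with a global fallback for unseen segments."""
    return heads.get(segment, global_head)(logit)

print(calibrate(0.0, ("HF", "ios")))   # per-segment head
print(calibrate(0.0, ("SR", "web")))   # unseen segment -> global fallback
```

The dict lookup also makes the near-zero serving cost noted in the caveats visible: choosing a head is one hash lookup plus one sigmoid per request.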
Caveats¶
- Pinterest's specific calibration head architecture not disclosed — whether it's Platt scaling, isotonic, or a learned small MLP is not stated.
- Improvement magnitude not disclosed. "Online experiments showed this approach improved performance" — directional only.
- Cost of surface-specific calibration is not discussed — training multiple calibration heads adds minor training overhead; serving cost is essentially zero (one extra sigmoid/regression per request).
- Distinct from surface-specific tower trees — calibration is the last layer; tower trees are the feature-transformation subnetworks above the calibration head.
Seen in¶
- 2026-03-03 Pinterest — Unifying Ads Engagement Modeling (sources/2026-03-03-pinterest-unifying-ads-engagement-modeling-across-pinterest-surfaces) — canonical wiki instance: view-type-specific calibration layer separating HF from SR traffic in a unified CTR model, with explicit online-experiment validation.