PATTERN
Survey-trained closeness model¶
Pattern¶
Train an ML model for a latent relationship quantity (closeness, trust, real-world tie strength) against a refreshed survey label that asks users about that quantity directly, rather than against a proxy derived from platform activity. Use platform-activity features as inputs; use the survey answer as the label.
Structure:
Random sample of user-friend pairs
│
▼
┌───────────────────────────────────────────────┐
│ Lightweight binary survey │
│ "Do you feel close to this connection │
│ in real life?" + proxy questions │
│ (e.g. how often you communicate) │
└──────────────────┬────────────────────────────┘
│ labels (refreshed regularly)
▼
┌─────────────────────────────────┐
│ Binary-classification ML model │
│ features = │
│ social-graph (mutual friends, │
│ connection strength, │
│ interaction patterns) + │
│ user attributes (location, │
│ friend count, posts shared) │
└─────────────┬───────────────────┘
│
▼
precomputed closeness score
(e.g. weekly inference over
trillions of friend-pairs)
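The structure above can be sketched end-to-end as a toy binary classifier. Everything here is hypothetical illustration — the feature names, the logistic-regression model, and the learning rate are stand-ins, not Meta's disclosed implementation; the only fixed idea is that platform-activity signals are inputs while the survey answer is the label.

```python
import math

# Hypothetical feature vector for one user-friend pair. Platform-activity
# signals are INPUTS only; the label comes from the survey answer.
def features(pair):
    return [pair["mutual_friends"], pair["messages_per_week"], pair["posts_shared"]]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(surveyed_pairs, epochs=200, lr=0.05):
    """Logistic regression on survey labels (1 = 'close in real life')."""
    w = [0.0] * len(features(surveyed_pairs[0]))
    b = 0.0
    for _ in range(epochs):
        for pair in surveyed_pairs:
            x, y = features(pair), pair["survey_label"]
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the log-loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g

    def closeness(pair):
        # At serving time this score would be precomputed in batch
        # over all friend-pairs, not called per-request.
        x = features(pair)
        return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

    return closeness
```

In production the trained model runs batch inference over every friend-pair on a regular cadence, and the stored scores are what downstream retrieval and ranking consume.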
When to use¶
- The quantity you care about is a latent real-world relationship the platform can't observe directly (closeness, trust, real-life-friendship, mentor-mentee, professional-collaborator).
- Platform-activity proxies (message volume, interaction count, tagged-photo overlap) are available but biased — two close real-life friends might interact rarely on-platform; two heavily interacting accounts might be weak ties.
- You can afford to ask — survey infrastructure exists, user base is large enough that a random sample yields a useful training set, user tolerance for occasional surveys is non-zero.
- You're OK with binary labels — a single close-vs-not-close question is easier to stabilise than a graded scale.
Why not proxies as labels¶
Using platform activity as a label (rather than a feature) induces optimise-to-proxy failure modes:
- Train on "frequency of interaction" as the label, and the model rewards interaction regardless of relationship quality.
- Train on "reacts to each other's posts" as the label, and the model penalises quiet-but-close relationships.
- Any proxy is a specific kind of interaction; models trained against it approximate that interaction, not closeness.
Surveys break the proxy cycle: the label encodes the actual latent variable ("close in real life"), and the platform-derived features are used only as inputs to predict it. The model can therefore learn which interaction patterns do correlate with real-life closeness and which don't.
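The distinction is easiest to see on a single pair. A hypothetical quiet-but-close relationship (the threshold of 3 messages per week is an arbitrary illustration):

```python
# A quiet-but-close pair: rare on-platform contact, close in real life.
pair = {"messages_per_week": 0, "survey_close": True}

# Proxy label, derived from platform activity: the quiet-but-close pair
# is mislabelled as not-close, so a model trained on this label learns
# to penalise exactly these relationships.
proxy_label = 1 if pair["messages_per_week"] >= 3 else 0

# Survey label: encodes the latent variable directly.
survey_label = 1 if pair["survey_close"] else 0
```

The two labelling schemes disagree on precisely the pairs the pattern exists to get right; under the survey label, low message frequency becomes a feature the model can learn to discount rather than a definition of the target.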
Canonical wiki reference¶
Meta Friend Bubbles (sources/2026-03-18-meta-friend-bubbles-enhancing-social-discovery-on-facebook-reels, 2026-03-18):
"It is trained on a regular cadence using a lightweight binary survey in which a randomly selected group of Facebook users is asked whether they feel close to a specific connection in real life. The survey is structured as a close vs. not-close prediction problem, refreshed regularly to keep labels current, and includes questions that act as proxies for offline relationship strength (such as how often two people communicate). In production, the model runs weekly inference over trillions of person-to-person connections across Facebook friends."
Features named: mutual friends, connection strength, interaction patterns, user-provided location, number of friends, number of posts shared.
Production scale: weekly, trillions of pairs. The output — viewer-friend closeness scores — feeds both the retrieval stage (candidate sourcing by closeness threshold) and the ranking stage (as features in the MTML rankers) of the Friend Bubbles pipeline.
Implementation considerations¶
- Survey-design hygiene. Wording, ordering, and response scale all affect label quality. Proxy questions ("how often do you communicate?") give the model additional signal but can also bias responses.
- Label freshness. Closeness drifts over time — people lose touch, form new close ties. Meta explicitly calls out "refreshed regularly to keep labels current." A one-time label set rots.
- Sample-selection bias. Surveyed users are those willing to respond; respondents differ systematically from non-respondents. Model training should weight or calibrate against this.
- Binary vs continuous labels. Binary is easier to stabilise but loses information. Meta chose binary for Friend Bubbles; trade-off acknowledged.
- Privacy / consent. Collecting survey data about relationships requires disclosure + consent. The model output (closeness over trillions of pairs) is also a privacy-sensitive derived signal — goes in the same governance bucket as other relationship inferences.
- Composing with a context-specific model. Meta pairs the survey-trained model with a second context-specific model trained on platform-interaction signals at the bubble surface specifically. The survey-trained model is the foundation; the context-specific model adapts to the specific deployment surface. How they compose (additive, re-ranker, gating) is not disclosed.
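For the sample-selection-bias consideration above, one standard remedy (a sketch, not anything Meta discloses) is inverse-propensity weighting: estimate each invited user's response propensity within strata of an observable covariate, then weight respondents by its inverse so the effective training sample matches the invited random sample. The `activity_bucket` stratification key here is a hypothetical example.

```python
from collections import Counter

def ipw_weights(invited, responded, key=lambda u: u["activity_bucket"]):
    """Per-respondent inverse-propensity weights.

    invited:   all users randomly sampled for the survey
    responded: the subset who actually answered
    key:       stratification function (hypothetical activity bucket
               here; any observable covariate works)
    """
    n_invited = Counter(key(u) for u in invited)
    n_responded = Counter(key(u) for u in responded)
    # Response propensity per stratum = responded / invited;
    # weight = 1 / propensity, so under-responding strata count more.
    return [n_invited[key(u)] / n_responded[key(u)] for u in responded]
```

These weights would be passed to the trainer as per-example weights, so that heavy-responder strata do not dominate the label distribution; the weights sum to the invited-sample size by construction.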
Anti-patterns¶
- Train only on platform activity. Collapses to optimise-to-proxy.
- Survey once, never refresh. Labels rot; drift between training distribution and serving distribution grows over time.
- Ask graded questions without calibration. A 1-5 scale is harder to stabilise than binary; cross-user calibration gaps dominate.
- Surface closeness scores to users without guardrails. Closeness inferences exposed to users — "here's how close we think you are to X" — create product + social risks disproportionate to the feature benefit.
Related patterns / concepts¶
- concepts/viewer-friend-closeness — the concept this pattern produces.
- patterns/human-calibrated-llm-labeling — the sibling pattern of using human labels to calibrate an LLM labeller.
- patterns/human-in-the-loop-quality-sampling — the sibling pattern of asking humans to evaluate outputs, rather than label inputs.
- systems/meta-friend-bubbles — canonical deployment.
Seen in¶
- sources/2026-03-18-meta-friend-bubbles-enhancing-social-discovery-on-facebook-reels — canonical Meta instance.