CONCEPT Cited by 1 source

Real-time context feature

Definition

A real-time context feature is a feature whose value is determined at request time by the immediate session state — the page a user is currently viewing, the query they just typed, the song queued, the document open. Real-time context features are distinguished from offline-precomputable features (long-term user history, item embeddings, demographic vectors) by availability: they cannot be batched in advance because their value isn't determined until the request arrives.

Examples

  • Subject Pin at Pinterest: the Pin a user is currently viewing on Related Pins.
  • Search query: the query text on a search result page.
  • Currently-playing track: the song / video / podcast active in the user's session.
  • Currently-open product / article / ticket: the item driving a "related items" recommendation.
  • Cart contents: the items already in a user's shopping cart, on a checkout-adjacent recommendation.
  • Geolocation at request time: where the user is right now, vs their typical home location.
  • Session-derived intent: aggregated signals from the user's actions in the last N minutes / hour.
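One way to make the request-time versus precomputable split concrete is to keep the two kinds of features in separate types. The sketch below is illustrative only: the class names and every field name are hypothetical, chosen to mirror the examples above rather than any real schema.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class RequestContext:
    """Values known only once the request arrives (all field names are hypothetical)."""
    subject_item_id: Optional[str] = None           # the Pin / product / article being viewed
    query_text: Optional[str] = None                # the text on a search result page
    cart_item_ids: Sequence[str] = ()               # items already in the cart
    lat_lon: Optional[Tuple[float, float]] = None   # geolocation at request time

    def is_empty(self) -> bool:
        """True when no context signal is present, e.g. no active session."""
        return (
            self.subject_item_id is None
            and not self.query_text
            and not self.cart_item_ids
            and self.lat_lon is None
        )


@dataclass(frozen=True)
class OfflineUserFeatures:
    """Values a batch job can compute ahead of time."""
    user_id: str
    history_embedding: Tuple[float, ...]            # long-term user history, precomputed


ctx = RequestContext(subject_item_id="pin_123")
offline = OfflineUserFeatures(user_id="u1", history_embedding=(0.1, 0.2))
```

Keeping the types apart makes it harder for a batch job to accidentally depend on a field that only exists at request time.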

Architectural consequences

A model that wants to consume a real-time context feature is forced into a particular shape:

  1. The consuming component must run online. No offline batch can compute the feature; the part of the model that uses it must run at request time.
  2. Training data is structurally hard to obtain. The feature exists at serving time but is rarely logged densely enough to use as training data. Mitigations include building real-context training pipelines (expensive) or synthetic pseudo-context augmentation (Pinterest's choice).
  3. Hybrid serving topologies emerge. When the model also has heavy offline-computable components (like a Transformer over long user history), the natural shape is hybrid offline/online inference — offline batch for the heavy historical encoder, online for the context-feature-consuming layer.
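The hybrid split in point 3 can be sketched as two functions that run at different times. This is a minimal illustration, not Pinterest's implementation: mean pooling stands in for the heavy historical encoder, a weighted blend stands in for the context layer, and all function and parameter names are assumptions.

```python
from typing import Dict, List


def run_offline_batch(user_histories: Dict[str, List[List[float]]]) -> Dict[str, List[float]]:
    """Offline job: mean-pool each user's item vectors (placeholder for a heavy encoder)."""
    store: Dict[str, List[float]] = {}
    for user_id, vectors in user_histories.items():
        dim = len(vectors[0])
        store[user_id] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return store


def online_context_layer(
    user_vec: List[float], context_vec: List[float], context_weight: float = 0.5
) -> List[float]:
    """Request-time step: blend the stored user vector with the context vector."""
    w = context_weight
    return [(1.0 - w) * u + w * c for u, c in zip(user_vec, context_vec)]


# Offline, ahead of time: precompute and store one vector per user.
embedding_store = run_offline_batch({"u1": [[1.0, 0.0], [0.0, 1.0]]})

# Online, per request: look up the stored vector and apply the cheap context layer.
request_vec = online_context_layer(embedding_store["u1"], context_vec=[1.0, 0.0])
```

Only `online_context_layer` sits on the request path, so the per-request cost stays independent of how long the user's history is.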

Why real-time context matters in retrieval, not just ranking

Real-time context has historically been a ranking-stage concern: rankers see the current page / query / session and re-rank candidates accordingly. Candidate generators (CGs), the retrieval-stage components, often run with offline-only embeddings.

Pinterest's Contextual Sequential CG post documents the structural cost of this asymmetry: a CG without real-time context is systematically outcompeted in the downstream funnel by ranker-aligned alternatives. The CG retrieves candidates the ranker keeps dropping because the candidates don't reflect immediate intent.

The fix is to bring real-time context into the CG itself — typically via a context layer that consumes the request-time features and a hybrid inference split that lets the rest of the user tower stay precomputed.
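A toy retrieval example shows why this matters. The sketch below assumes dot-product retrieval over a tiny hand-made catalogue and an additive combination of the user and context vectors. Both are illustrative assumptions, not the method described in the Pinterest post, and all item names and vectors are made up.

```python
from typing import Dict, List


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_vec: List[float], item_vecs: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k item ids with the highest dot-product score."""
    ranked = sorted(item_vecs, key=lambda item_id: dot(query_vec, item_vecs[item_id]), reverse=True)
    return ranked[:k]


# Hypothetical 2-d embeddings: axis 0 is "outdoor gear", axis 1 is "home decor".
ITEMS = {
    "hiking_boots": [1.0, 0.0],
    "tent": [0.9, 0.1],
    "sofa": [0.0, 1.0],
    "lamp": [0.1, 0.9],
}

user_vec = [0.2, 0.8]      # precomputed: long-term history leans toward home decor
context_vec = [1.0, 0.0]   # request time: the user is viewing outdoor gear right now

without_context = retrieve(user_vec, ITEMS, k=2)
with_context = retrieve([u + c for u, c in zip(user_vec, context_vec)], ITEMS, k=2)
```

Here `without_context` is `["sofa", "lamp"]` and `with_context` is `["hiking_boots", "tent"]`: the context-free generator retrieves candidates that do not match what the user is looking at, which is the mismatch a context-aware ranker would then penalise.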

Trade-offs

Cost vs latency

Computing real-time context features online adds compute to every request. That is acceptable for lightweight features (interest-category embeddings) but prohibitive for heavy ones (full image encoding, re-running a full sequence model). Heavy features have to be precomputed upstream as offline embeddings and looked up at request time.
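The precompute-then-look-up pattern can be sketched as follows. The encoder here is a trivial stand-in for an expensive model, and the function names, item ids, and inputs are all hypothetical.

```python
from typing import Dict, List, Optional


def heavy_item_encoder(pixels: List[float]) -> List[float]:
    """Stand-in for an expensive encoder (hypothetical); meant to run offline only."""
    return [sum(pixels) / len(pixels), max(pixels)]


def build_embedding_table(catalog: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Offline: encode every catalogue item once and store the result."""
    return {item_id: heavy_item_encoder(pixels) for item_id, pixels in catalog.items()}


def lookup_context_embedding(
    subject_item_id: str, table: Dict[str, List[float]]
) -> Optional[List[float]]:
    """Request time: a key lookup only; the heavy encoder is never called here."""
    return table.get(subject_item_id)


table = build_embedding_table({"pin_1": [0.0, 1.0], "pin_2": [0.5, 0.5]})
current = lookup_context_embedding("pin_1", table)
missing = lookup_context_embedding("pin_404", table)
```

The identity of the current item is still a real-time context feature; only its representation is precomputed. Items missing from the table return `None`, and the caller has to decide how to handle them.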

Freshness vs blast radius

Real-time context features can change every request, so the model behaviour can vary substantially within one session. This is desirable for relevance but also means a bug in the context-feature pipeline produces immediately visible quality degradation. Logging, monitoring, and graceful degradation matter more than for stable offline features.
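A minimal sketch of the monitoring and graceful-degradation point: wrap the context fetch, count each outcome, and return `None` so the caller can fall back to a context-free path. The in-process `Counter` stands in for a real metrics client, and the metric names are made up.

```python
from collections import Counter
from typing import Callable, List, Optional

metrics: Counter = Counter()  # stand-in for a real metrics client


def safe_context_vector(
    fetch: Callable[[], Optional[List[float]]], dim: int
) -> Optional[List[float]]:
    """Return a validated context vector, or None if it is missing or malformed."""
    try:
        vec = fetch()
    except Exception:
        metrics["context_fetch_error"] += 1
        return None
    if vec is None or len(vec) != dim:
        metrics["context_invalid"] += 1
        return None
    metrics["context_ok"] += 1
    return vec


def broken_fetch() -> Optional[List[float]]:
    raise RuntimeError("context pipeline unavailable")


good = safe_context_vector(lambda: [0.1, 0.2], dim=2)
bad_shape = safe_context_vector(lambda: [0.1], dim=2)
failed = safe_context_vector(broken_fetch, dim=2)
```

Counting successes alongside failures lets a dashboard show the failure rate rather than only absolute error counts.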

Real-context training data

The hard part is making the model learn to use the feature when training data doesn't have the feature attached. See concepts/pseudo-context-augmentation for Pinterest's approach.

Caveats

  • Generic concept; not Pinterest-coined. The term "real-time context" is widely used in recsys / ranking literature; this page summarises how it is used in the Pinterest source covered on this wiki.
  • Distinction from streaming features. Streaming features (last 5-minute click count, last-hour conversion count) are derived from real-time data but typically batched at small intervals; real-time context features (current Pin, current query) are session-state, not aggregated metrics.
  • Cold start specific to context-feature models. Users with no current session have no context feature; the model needs a graceful-degradation posture (often: fall back to context-free prediction or use a sentinel zero vector).
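The sentinel fallback mentioned in the cold-start caveat might look like the sketch below. Returning a presence flag alongside the vector is an added assumption, not something the source describes; it lets a model tell "no context" apart from a real context vector that happens to be all zeros.

```python
from typing import List, Optional, Tuple


def context_or_sentinel(
    context_vec: Optional[List[float]], dim: int
) -> Tuple[List[float], bool]:
    """Return (vector, has_context); a zero vector stands in for a missing context."""
    if context_vec is None:
        return [0.0] * dim, False
    return list(context_vec), True


cold_vec, cold_flag = context_or_sentinel(None, dim=3)
warm_vec, warm_flag = context_or_sentinel([0.2, 0.0, 0.7], dim=3)
```

If the model is expected to handle the sentinel well, it needs to see sentinel inputs during training too, which ties back to the training-data problem discussed earlier.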
