Pinterest user-sequence platform

Definition

The Pinterest user-sequence platform is the multi-tenant data substrate that ingests, filters, enriches, and serves user event sequences, each defined as "an ordered list of recent, relevant events for a user, along with the enrichments (signals) attached to each event", for ranking, retrieval, and recommendation models across Related Pins, Home Feed, Search, and other Pinterest surfaces. It produces the ~16K-token sequences fed into the Pinterest Foundation Model and TransAct (Source: sources/2026-05-21-pinterest-making-user-sequence-data-more-cost-efficient-faster-and-easier-to-use).

What it does

The platform exposes one contract to consumers:

"Request sequence X for user U, and you'll get a well-defined schema of enriched events, with a documented freshness and completeness profile."

Behind this contract sit four pipeline stages that map every raw event into a normalised, enriched sequence record:

  1. Ingest — events from streaming (Kafka) and batch (data-warehouse tables, log archives, snapshots) sources.
  2. Filter — predicate-based selection of the subset of events that matter for a given signal definition.
  3. Enrich — apply embeddings, contextual features (surface / device / country), and derived attributes / counters.
  4. Assemble — produce the stable, well-defined sequence representation.
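The four stages above can be sketched as plain functions. This is a minimal illustration only: the `Event` shape, the function names, and the in-memory lists standing in for Kafka and warehouse sources are all assumptions, not Pinterest's actual interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Event:
    """Hypothetical raw event; field names are illustrative."""
    user_id: str
    event_type: str
    timestamp: int  # epoch seconds
    payload: dict
    signals: dict = field(default_factory=dict)  # filled in by the enrich stage


def ingest(*sources: Iterable[Event]) -> list[Event]:
    """Stage 1: merge events from any number of streaming or batch sources."""
    return [event for source in sources for event in source]


def filter_events(events: list[Event], predicate: Callable[[Event], bool]) -> list[Event]:
    """Stage 2: keep only the events that matter for a signal definition."""
    return [event for event in events if predicate(event)]


def enrich(events: list[Event], enrichers: dict[str, Callable[[Event], object]]) -> list[Event]:
    """Stage 3: attach each named signal to every event (mutates in place)."""
    for event in events:
        for name, fn in enrichers.items():
            event.signals[name] = fn(event)
    return events


def assemble(events: list[Event], user_id: str, max_len: int) -> list[Event]:
    """Stage 4: the user's most recent events, ordered oldest first."""
    if max_len <= 0:
        return []
    mine = sorted((e for e in events if e.user_id == user_id), key=lambda e: e.timestamp)
    return mine[-max_len:]


# Lists stand in for a streaming source and a batch source.
stream = [
    Event("u1", "click", 30, {"surface": "home"}),
    Event("u1", "impression", 20, {"surface": "search"}),
]
batch = [
    Event("u1", "click", 10, {"surface": "search"}),
    Event("u2", "click", 15, {"surface": "home"}),
]

events = ingest(stream, batch)
clicks = filter_events(events, lambda e: e.event_type == "click")
enriched = enrich(clicks, {"surface": lambda e: e.payload["surface"]})
sequence = assemble(enriched, "u1", max_len=2)
```

Here `sequence` holds the two click events for `u1`, oldest first, each carrying a `surface` signal.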

The same contract serves three workload types: training datasets (offline batch reads of long history windows), offline analysis (data-scientist queries over sequence data), and online inference (real-time fetch of up-to-date sequences for ranking / retrieval at request time).

Six-component architecture

Pinterest's redesign gives the platform six major pieces that compose into the contract:

  1. Ingestion (stream + batch) — streaming for real-time events, batch for warehouse tables / log archives / snapshots.
  2. Enrichment + execution layer — shared execution engine that turns raw events into enriched records based on configuration: filters, joins, transforms. Same engine powers streaming and batch.
  3. Real-time indexer — streaming job that filters incoming events, normalises, applies enrichments, and writes incremental updates to a time-versioned store suitable for low-latency reads.
  4. Batch indexer + backfill pipeline — scheduled batch jobs that read historical raw events, apply the same filter and enrichment definitions, and produce longer sequences plus reusable intermediate datasets for backfills and offline consumption.
  5. Columnar, time-partitioned storage — sequence data in a columnar layout so models can read exactly the columns they need; time-partitioned to keep writes + scans bounded as history grows; supports both long-sequence use cases and efficient truncation for shorter windows.
  6. Online serving API — exposes the platform contract: takes a sequence/feature name + user identifier, fetches the right columns from storage, performs request-time enrichments, and applies any final selection or trimming logic ("last N events within this time window").

Data flow through the components:

ingest ─┬─► engine ──► indexer ──► columnar storage ──► serving API ──► consumer
        │                              ▲
        │     (streaming path: low-latency, "now" view)
        └─► engine ──► batch indexer ──┘
                                   (batch path: long sequences,
                                    backfills, late-arrival corrections)
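As a rough sketch of what the online serving API does per request, the function below projects the requested columns, applies a time window, trims to the last N events, and then runs request-time enrichments. The `STORE` dictionary, the `get_sequence` signature, and the column names are hypothetical stand-ins for the real storage and API.

```python
from typing import Callable, Optional

# Hypothetical in-memory stand-in for columnar storage:
# (sequence name, user id) -> column name -> row-aligned values, sorted by timestamp.
STORE = {
    ("engagement_v1", "u1"): {
        "timestamp": [100, 200, 300, 400],
        "pin_id": ["a", "b", "c", "d"],
        "surface": ["home", "search", "home", "home"],
    }
}


def get_sequence(
    name: str,
    user_id: str,
    columns: list[str],
    now: int,
    window_s: int,
    last_n: int,
    request_enrichers: Optional[dict[str, Callable[[dict], object]]] = None,
) -> list[dict]:
    stored = STORE.get((name, user_id))
    if stored is None or last_n <= 0:
        return []
    # Fetch only the requested columns (timestamp is always needed for trimming).
    wanted = ["timestamp"] + [c for c in columns if c != "timestamp"]
    rows = [dict(zip(wanted, values)) for values in zip(*(stored[c] for c in wanted))]
    # Final selection: "last N events within this time window".
    rows = [r for r in rows if now - window_s <= r["timestamp"] <= now]
    rows = rows[-last_n:]
    # Request-time enrichments are computed per call, not stored.
    for row in rows:
        for key, fn in (request_enrichers or {}).items():
            row[key] = fn(row)
    return rows


result = get_sequence(
    "engagement_v1", "u1", ["pin_id"],
    now=450, window_s=300, last_n=2,
    request_enrichers={"age_s": lambda r: 450 - r["timestamp"]},
)
```

The `surface` column is never read in this call, which is the point of the column projection.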

Design decisions

Four design decisions, plus a migration discipline, define the platform:

1. Configuration-as-code for sequence + enrichment definitions

Python configs with a well-defined schema describe sequence features, event types (sources + filters + enrichments), and enrichment definitions (how to fetch + map signals). The configs are validated, compiled to portable JSON, stored in managed internal object storage, and consumed by the streaming, batch, and serving jobs.

Named benefits:

  • Onboarding new event types or enrichments → primarily configuration + small isolated executor code, not new bespoke pipelines.
  • Diffs human-readable; code-owner review; rollbacks straightforward; version history in standard VCS.
  • Clear separation: ML / product teams own what (events, features, filters); platform owns how (reliable + efficient execution).
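To make the configuration-as-code idea concrete, here is a hedged sketch of Python definitions that are validated and then compiled to JSON. The class names, fields, source strings, and the `compile_config` helper are invented for illustration; the source describes the approach but not its schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EnrichmentDef:
    """Hypothetical: how to fetch and map one signal."""
    name: str
    source: str      # where the signal is fetched from
    key_field: str   # event field used as the lookup key


@dataclass(frozen=True)
class EventTypeDef:
    """Hypothetical: sources + filter + enrichments for one event type."""
    name: str
    sources: tuple
    filter_expr: str   # opaque filter expression, interpreted by the engine
    enrichments: tuple


@dataclass(frozen=True)
class SequenceFeatureDef:
    """Hypothetical: a named sequence built from one or more event types."""
    name: str
    event_types: tuple
    max_length: int


def compile_config(feature, event_types, enrichments) -> str:
    """Validate cross-references, then emit portable JSON."""
    if feature.max_length <= 0:
        raise ValueError("max_length must be positive")
    known_enrichments = {e.name for e in enrichments}
    known_events = {t.name for t in event_types}
    for event_type in event_types:
        missing = set(event_type.enrichments) - known_enrichments
        if missing:
            raise ValueError(f"{event_type.name}: unknown enrichments {sorted(missing)}")
    missing = set(feature.event_types) - known_events
    if missing:
        raise ValueError(f"{feature.name}: unknown event types {sorted(missing)}")
    compiled = {
        "feature": asdict(feature),
        "event_types": [asdict(t) for t in event_types],
        "enrichments": [asdict(e) for e in enrichments],
    }
    return json.dumps(compiled, sort_keys=True, indent=2)


enrichments = [EnrichmentDef("pin_embedding", "embedding_store", "pin_id")]
event_types = [
    EventTypeDef(
        "repin",
        ("kafka:repins", "warehouse:repins_daily"),
        "is_spam == false",
        ("pin_embedding",),
    )
]
feature = SequenceFeatureDef("user_engagement_v1", ("repin",), 500)
compiled_json = compile_config(feature, event_types, enrichments)
```

Because the compiled output is deterministic JSON (`sort_keys=True`), diffs between versions stay small and reviewable, which matches the human-readable-diff benefit listed above.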

2. Shared execution engine with pluggable executors

Framework + plugin contract:

  • Framework owns: data-source / sink wiring, concurrency, retries, backpressure, configuration parsing + validation.
  • Executor is a plugin that converts a raw event into one or more enriched records — the "business logic module" for a particular event type or grouping. Owns event-type-specific filtering, featurisation, and raw → normalised mapping.

The same engine + executors run in streaming and batch — minimising code duplication and reducing drift between batch and real-time behavior.
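The framework/plugin split might look like the following sketch, in which the engine owns dispatch and retries while an executor owns event-type-specific logic. `Executor`, `RepinExecutor`, `Engine`, and the raw field names are hypothetical; concurrency, backpressure, and sink wiring are omitted.

```python
from typing import Iterable, Iterator, Protocol


class Executor(Protocol):
    """Plugin contract: turn one raw event into zero or more enriched records."""
    event_type: str

    def process(self, raw: dict) -> Iterable[dict]: ...


class RepinExecutor:
    """Example plugin: filtering and raw -> normalised mapping for one event type."""
    event_type = "repin"

    def process(self, raw: dict) -> Iterable[dict]:
        if raw.get("is_spam"):
            return []  # event-type-specific filtering
        return [{
            "user_id": raw["uid"],
            "ts": raw["ts"],
            "event_type": "repin",
            "pin_id": raw["pin"],
        }]


class Engine:
    """Framework side: routes raw events to executors and retries failures."""

    def __init__(self, max_retries: int = 2) -> None:
        self._executors: dict[str, Executor] = {}
        self._max_retries = max_retries

    def register(self, executor: Executor) -> None:
        self._executors[executor.event_type] = executor

    def run(self, source: Iterable[dict]) -> Iterator[dict]:
        for raw in source:
            executor = self._executors.get(raw.get("type"))
            if executor is None:
                continue  # no plugin registered for this event type
            for attempt in range(self._max_retries + 1):
                try:
                    records = list(executor.process(raw))
                    break
                except Exception:
                    if attempt == self._max_retries:
                        raise
            yield from records


engine = Engine()
engine.register(RepinExecutor())

raw_events = [
    {"type": "repin", "uid": "u1", "ts": 1, "pin": "p1"},
    {"type": "repin", "uid": "u1", "ts": 2, "pin": "p2", "is_spam": True},
    {"type": "unknown"},
]
batch_out = list(engine.run(raw_events))         # batch: a finite collection
stream_out = list(engine.run(iter(raw_events)))  # streaming: an iterator of events
```

Both calls go through the same executor code, so the batch and streaming outputs are identical for identical input.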

3. Lambda architecture for fresh + complete sequences

The streaming and batch paths cooperate rather than compete:

  • Streaming path → near-real-time view of user sequences for online inference. Owns "now".
  • Batch path → periodic recompute of enriched events + sequences from raw historical data. Produces long sequences + reusable datasets for backfills and analysis. Owns "fixing history" (late events, corrections, backfills).

Cost is reduced vs classical Lambda because executor logic is shared across the two paths via the shared execution engine — only the scheduling shape differs.
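One way to picture how the two paths combine at read time is a watermark merge: the batch view is treated as authoritative up to the last batch run, and the streaming view supplies everything newer. The source does not say how Pinterest reconciles the views, so the watermark rule, `merge_views`, and the event fields below are all assumptions.

```python
def merge_views(batch_events: list[dict], stream_events: list[dict], batch_watermark: int) -> list[dict]:
    """Combine batch and streaming views of one user's sequence.

    Assumption: batch is authoritative for ts <= batch_watermark (it has seen
    late events and corrections); streaming owns ts > batch_watermark.
    """
    history = [e for e in batch_events if e["ts"] <= batch_watermark]
    recent = [e for e in stream_events if e["ts"] > batch_watermark]
    return sorted(history + recent, key=lambda e: e["ts"])


batch_view = [
    {"event_id": "e1", "ts": 10, "surface": "home"},
    {"event_id": "e2", "ts": 20, "surface": "search"},  # corrected by batch recompute
]
stream_view = [
    {"event_id": "e2", "ts": 20, "surface": None},      # enrichment missing at stream time
    {"event_id": "e3", "ts": 30, "surface": "home"},    # not yet seen by batch
]

merged = merge_views(batch_view, stream_view, batch_watermark=20)
```

In this example the batch version of `e2` replaces the incomplete streaming version, while `e3` comes from the streaming path only.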

4. Columnar, time-partitioned storage with table semantics

Replaces pre-redesign "large, consolidated 'enriched event' blobs" where every read pulled the whole payload regardless of which features the consumer actually used.

  • Each enrichment / feature → its own column.
  • Reads → select only required columns.
  • Writes + scans → bounded to relevant time partitions.
  • Layout supports both long-sequence use cases and short-window truncation.

Operational + efficiency wins:

  • Better compression + lower network bandwidth (no fat blobs).
  • Bounded I/O even as history grows.
  • Familiar table abstractions for inspecting anomalous days / event types, validating new enrichments, comparing pipelines side-by-side.
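The read pattern this layout enables can be illustrated with a toy table keyed by day partition, where each partition stores one list per column. `TABLE`, `read_columns`, and the column names are hypothetical; a real system would use a columnar file or table format rather than Python dictionaries.

```python
from datetime import date

# Hypothetical toy table: day partition -> column name -> row-aligned values.
TABLE = {
    date(2026, 5, 1): {
        "user_id": ["u1", "u2"],
        "pin_id": ["p1", "p2"],
        "surface": ["home", "home"],
    },
    date(2026, 5, 2): {
        "user_id": ["u1"],
        "pin_id": ["p3"],
        "surface": ["search"],
    },
    date(2026, 5, 3): {
        "user_id": ["u2", "u1"],
        "pin_id": ["p4", "p5"],
        "surface": ["home", "search"],
    },
}


def read_columns(table: dict, columns: list[str], start: date, end: date, user_id: str) -> list[dict]:
    """Read only the requested columns from partitions inside [start, end]."""
    rows = []
    for day in sorted(table):
        if not (start <= day <= end):
            continue  # partition pruning: days outside the window are never touched
        partition = table[day]
        for i, uid in enumerate(partition["user_id"]):
            if uid == user_id:
                # Column projection: unrequested columns are never read.
                rows.append({c: partition[c][i] for c in columns})
    return rows


rows = read_columns(TABLE, ["pin_id"], date(2026, 5, 2), date(2026, 5, 3), "u1")
```

Shortening the window shrinks the set of partitions scanned, which is how the same layout serves both long-sequence and short-window consumers.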

Migration: event-type-by-event-type shadow cutover

For each event type:

  1. Run new pipeline in parallel with legacy → "shadow sequences".
  2. Two-tier comparison — event-level field-by-field on matched events + sequence-level shadow vs legacy.
  3. A/B experiments on new-data sequences (sequences are model inputs, downstream model behavior is the ultimate validation).
  4. Controlled cutover: shift consumers to new architecture.
  5. Iterate to next event type; deprecate legacy path incrementally.

A 100% match is not the goal; the bar is "approximately the same sequences", backed by sufficient validation evidence.
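The two-tier comparison in step 2 could be sketched as below: a field-by-field check on events matched by ID, plus a sequence-level overlap score. The source does not name the metrics used, so the Jaccard overlap, the function names, and the `event_id` matching key are assumptions chosen for illustration.

```python
def compare_events(legacy: list[dict], shadow: list[dict], fields: list[str]) -> tuple[int, dict]:
    """Event level: count field mismatches on events present in both pipelines."""
    shadow_by_id = {e["event_id"]: e for e in shadow}
    mismatches = {f: 0 for f in fields}
    matched = 0
    for event in legacy:
        other = shadow_by_id.get(event["event_id"])
        if other is None:
            continue  # unmatched events are covered by the sequence-level check
        matched += 1
        for f in fields:
            if event.get(f) != other.get(f):
                mismatches[f] += 1
    return matched, mismatches


def sequence_overlap(legacy: list[dict], shadow: list[dict]) -> float:
    """Sequence level: Jaccard overlap of event IDs (an assumed metric)."""
    a = {e["event_id"] for e in legacy}
    b = {e["event_id"] for e in shadow}
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


legacy_seq = [
    {"event_id": "e1", "pin_id": "p1", "surface": "home"},
    {"event_id": "e2", "pin_id": "p2", "surface": "search"},
    {"event_id": "e3", "pin_id": "p3", "surface": "home"},
]
shadow_seq = [
    {"event_id": "e1", "pin_id": "p1", "surface": "home"},
    {"event_id": "e2", "pin_id": "p2", "surface": "home"},  # enrichment differs
    {"event_id": "e4", "pin_id": "p4", "surface": "home"},  # only in new pipeline
]

matched, mismatches = compare_events(legacy_seq, shadow_seq, ["pin_id", "surface"])
overlap = sequence_overlap(legacy_seq, shadow_seq)
```

A cutover decision would then compare these numbers against a tolerance per event type rather than requiring exact equality.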

Operational readiness

Pinterest dashboards the platform along the same axes as the sequence-quality dimensions it promises consumers:

Quality dimension    Dashboard
-----------------    --------------------------------------------
Freshness            sequence freshness + lag
Completeness         event + enrichment coverage
Stable schemas       schema drift + configuration rollout status
Tenant SLO           serving latency + error rates

"A platform that many teams rely on will eventually have bad days; the difference between a minor blip and a major incident often comes down to whether you can quickly see what went wrong and where."
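As an illustration of the freshness and completeness dimensions, the two helpers below compute a lag and a coverage ratio for a single sequence. The function names, the event fields, and the exact metric definitions are assumptions; the source names the dashboards but not their formulas.

```python
from typing import Optional


def freshness_lag_s(sequence: list[dict], now: int) -> Optional[int]:
    """Seconds between 'now' and the newest event; None for an empty sequence."""
    if not sequence:
        return None
    return now - max(e["ts"] for e in sequence)


def enrichment_coverage(sequence: list[dict], signal: str) -> float:
    """Fraction of events that carry a non-missing value for the given signal."""
    if not sequence:
        return 0.0
    present = sum(1 for e in sequence if e.get(signal) is not None)
    return present / len(sequence)


served = [
    {"ts": 100, "embedding": [0.1, 0.2]},
    {"ts": 160, "embedding": [0.3, 0.4]},
    {"ts": 200, "embedding": None},       # enrichment lookup failed
    {"ts": 220, "embedding": [0.5, 0.6]},
]

lag = freshness_lag_s(served, now=250)
coverage = enrichment_coverage(served, "embedding")
```

Aggregated across users and event types, values like these are the kind of series a freshness or coverage dashboard would chart.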

Outcomes (qualitative per company policy)

  • Cost — significant infrastructure cost reductions on storage / replication / network as event types migrated.
  • Productivity — onboarding time for new enrichments + event types dropped substantially; most changes are configuration + small executor code rather than bespoke pipelines.
  • Quality — major recommendation surfaces saw improved engagement metrics post-migration.

Future work

  • Self-serve tooling — wizards for new signals, static analysis on configurations, automated backfill orchestration for common patterns. Goal: filling out a template instead of editing infrastructure code.
  • Stronger correctness guarantees — anomaly detection over indexing + serving paths.
  • Richer signals — extend coverage to more event types + surfaces; add session-level / object-level abstractions on top of raw event sequences while preserving the "events → enriched signals → sequences" contract.

Why it matters

The platform is the structural answer to online-offline discrepancy at the data-substrate layer. Instead of debugging mismatched features after a model launches, the "one definition, many runtimes" organising principle and the shared execution engine are meant to rule out definition divergence between training and serving by construction. Subsequent Pinterest model-side scaling work, such as the 100× Foundation Model scale-up absorbed via request-level deduplication (Source: sources/2026-04-13-pinterest-scaling-recommendation-systems-with-request-level-deduplication), depends on the user-sequence quality this platform produces.
