CONCEPT Cited by 1 source
User event sequence
Definition
A user event sequence is "an ordered list of recent, relevant events for a user, along with the enrichments (signals) attached to each event" (Source: sources/2026-05-21-pinterest-making-user-sequence-data-more-cost-efficient-faster-and-easier-to-use). Each event in the sequence carries a timestamp, an action type, the surface where the action occurred, and a payload of enrichments: embeddings (Pin / query representations), contextual features (surface, device, country), and derived attributes / counters describing how the user interacted with content over time.
Concrete Pinterest example: "a sequence made up of the last 500 engagements a user had with Pinterest Pins", where each event might carry a timestamp + action type + surface + a handful of embedding features or categorical attributes.
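The shape described above can be sketched as a small data structure. This is a minimal illustration, not Pinterest's actual schema: the names `Event`, `UserEventSequence`, `timestamp_ms`, `enrichments`, and `max_len` are all hypothetical, and the default cap of 500 simply mirrors the "last 500 engagements" example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

# Hypothetical enrichment value: an embedding vector or a scalar attribute.
Enrichment = Union[List[float], str, int, float]


@dataclass(frozen=True)
class Event:
    timestamp_ms: int   # when the action happened
    action_type: str    # e.g. "click", "save"
    surface: str        # where the action occurred
    enrichments: Dict[str, Enrichment] = field(default_factory=dict)


@dataclass
class UserEventSequence:
    user_id: str
    max_len: int = 500  # assumed cap, mirroring the "last 500 engagements" example
    events: List[Event] = field(default_factory=list)

    def append(self, event: Event) -> None:
        """Add an event, keep time order, and retain only the newest max_len events."""
        self.events.append(event)
        self.events.sort(key=lambda e: e.timestamp_ms)
        overflow = len(self.events) - self.max_len
        if overflow > 0:
            self.events = self.events[overflow:]


seq = UserEventSequence(user_id="u1", max_len=2)
seq.append(Event(30, "save", "home_feed"))
seq.append(Event(10, "click", "search"))
seq.append(Event(20, "click", "home_feed", {"country": "US"}))
```

With a cap of 2, the oldest event (timestamp 10) is dropped and the remaining two stay in time order.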
Why sequences are a first-class primitive
User event sequences capture temporal behavior — the order, timing, and shape of actions — rather than just aggregates like "how many clicks last week". This makes them the substrate for:
- Sequence-aware models — Transformers, sequence encoders, attention-over-history architectures (Pinterest examples: Foundation Model, TransAct, Contextual Sequential CG).
- Cross-workload reuse — because they preserve relatively raw behavior, the same sequence can feed ranking, retrieval, exploration, anomaly detection.
- Compositional features — derived attributes / counters / aggregations can be computed downstream from the raw sequence rather than landed as separate features.
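As a rough illustration of the "compositional features" point, the sketch below derives a windowed counter and a timing statistic directly from a raw sequence. The tuple layout `(timestamp_s, action_type, surface)` and the function names are assumptions made for this example only.

```python
from collections import Counter
from typing import List, Optional, Tuple

# Hypothetical raw event layout: (timestamp_s, action_type, surface).
RawEvent = Tuple[int, str, str]


def action_counts(events: List[RawEvent], now_s: int, window_s: int) -> Counter:
    """Count actions per type within the trailing window ending at now_s."""
    cutoff = now_s - window_s
    return Counter(action for ts, action, _ in events if ts >= cutoff)


def mean_gap_seconds(events: List[RawEvent]) -> Optional[float]:
    """Average time between consecutive events, or None with fewer than two events."""
    stamps = sorted(ts for ts, _, _ in events)
    if len(stamps) < 2:
        return None
    return (stamps[-1] - stamps[0]) / (len(stamps) - 1)


events = [(100, "click", "home"), (160, "save", "home"), (400, "click", "search")]
recent = action_counts(events, now_s=500, window_s=200)
gap = mean_gap_seconds(events)
```

The aggregate ("clicks in the last 200 seconds") is recoverable from the sequence, while the sequence also retains timing information that the aggregate alone would discard.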
Why "the N latest events from a log table" is wrong
A high-quality sequence is not "the N latest events from a log table". It is the result of a multi-step process (Source: sources/2026-05-21-pinterest-making-user-sequence-data-more-cost-efficient-faster-and-easier-to-use):
- Ingest — events from diverse sources (streaming + batch).
- Filter — narrow to the subset of events that matter for this signal.
- Enrich — attach embeddings, metadata, derived attributes per event.
- Assemble — produce a stable, well-defined sequence representation.
Each stage has a distinct platform concern: source connectors and ingestion, filtering predicates, enrichment services and joins, and assembly and schema. Treating a sequence as the output of this multi-stage pipeline, rather than as "the row tail", is what makes the "one definition, many runtimes" model coherent.
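The four stages can be sketched as plain functions over in-memory lists. This is a toy illustration under stated assumptions, not the platform's implementation: the event fields (`event_id`, `ts`, `action`, `pin_id`), the embedding lookup, and all function names are hypothetical.

```python
from itertools import chain
from typing import Any, Dict, Iterable, List, Set

Event = Dict[str, Any]


def ingest(sources: Iterable[Iterable[Event]]) -> List[Event]:
    """Ingest: merge events from several sources (e.g. streaming and batch)."""
    return list(chain.from_iterable(sources))


def filter_events(events: List[Event], keep_actions: Set[str]) -> List[Event]:
    """Filter: keep only the action types that matter for this signal."""
    return [e for e in events if e["action"] in keep_actions]


def enrich(events: List[Event], embeddings: Dict[str, List[float]]) -> List[Event]:
    """Enrich: attach an embedding per event (None when the lookup misses)."""
    return [{**e, "embedding": embeddings.get(e["pin_id"])} for e in events]


def assemble(events: List[Event], max_len: int) -> List[Event]:
    """Assemble: de-duplicate by event_id, order by time, keep the newest max_len."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    unique = {e["event_id"]: e for e in events}
    ordered = sorted(unique.values(), key=lambda e: (e["ts"], e["event_id"]))
    return ordered[-max_len:]


streaming = [
    {"event_id": "e3", "ts": 30, "action": "save", "pin_id": "p2"},
    {"event_id": "e4", "ts": 40, "action": "impression", "pin_id": "p3"},
]
batch = [
    {"event_id": "e1", "ts": 10, "action": "click", "pin_id": "p1"},
    {"event_id": "e3", "ts": 30, "action": "save", "pin_id": "p2"},  # duplicate of a streaming event
]
sequence = assemble(
    enrich(
        filter_events(ingest([streaming, batch]), {"click", "save"}),
        {"p1": [0.1], "p2": [0.2]},
    ),
    max_len=500,
)
```

Even in this toy form, the result differs from "the last N rows": the impression is filtered out, the event seen on both paths appears once, and every surviving event carries its enrichment.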
The four-dimensional quality contract
Sequences serve training + offline analysis + online inference simultaneously, so sequence quality is multi-dimensional. See sequence quality dimensions:
- Freshness — how quickly new events / enrichments show up.
- Completeness — late-arriving events / corrections / backfills are eventually reflected.
- Consistent enrichment — same enrichments across streaming + batch; training + serving see aligned data.
- Stable schemas — versioned + predictable, not silently changed.
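One way to make the four dimensions concrete is to express each as a small check over sequence data. The sketch below is purely illustrative: the field names (`event_id`, `ts`, `schema_version`) and the check functions are assumptions, and a production system would likely measure these properties quite differently.

```python
from typing import Any, Dict, List, Optional, Set

Event = Dict[str, Any]


def freshness_lag_s(sequence: List[Event], now_s: int) -> Optional[int]:
    """Freshness: seconds between now and the newest event in the sequence."""
    if not sequence:
        return None
    return now_s - max(e["ts"] for e in sequence)


def missing_event_ids(served: List[Event], reference: List[Event]) -> Set[str]:
    """Completeness: events present in the reference (e.g. backfilled) view but not served."""
    return {e["event_id"] for e in reference} - {e["event_id"] for e in served}


def enrichment_mismatches(streaming: List[Event], batch: List[Event], keys: List[str]) -> List[str]:
    """Consistent enrichment: ids whose enrichment values differ between the two paths."""
    batch_by_id = {e["event_id"]: e for e in batch}
    return [
        e["event_id"]
        for e in streaming
        if e["event_id"] in batch_by_id
        and any(e.get(k) != batch_by_id[e["event_id"]].get(k) for k in keys)
    ]


def has_expected_schema(sequence: List[Event], version: int) -> bool:
    """Stable schemas: every event declares the expected schema version."""
    return all(e.get("schema_version") == version for e in sequence)


streaming = [
    {"event_id": "e1", "ts": 100, "country": "US", "schema_version": 2},
    {"event_id": "e2", "ts": 160, "country": "US", "schema_version": 2},
]
batch = [
    {"event_id": "e0", "ts": 50, "country": "US", "schema_version": 2},
    {"event_id": "e1", "ts": 100, "country": "US", "schema_version": 2},
    {"event_id": "e2", "ts": 160, "country": "CA", "schema_version": 2},
]
```

In this example the streaming view lags by 40 seconds at `now_s=200`, is missing the late-arriving `e0`, and disagrees with batch on the `country` enrichment for `e2`.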
Workload spread
User event sequences appear in three infrastructure-distinct places:
| Workload | What it does | Latency profile |
|---|---|---|
| Training datasets | Pull long history windows of enriched events per user | Batch, throughput-bound |
| Offline analysis | Slice user behavior across sessions / surfaces / campaigns | Interactive query, latency-bound |
| Online inference | Real-time sequence fetch at request time for ranking / retrieval | Low-latency, p99-bound |
Pinterest's user-sequence platform serves all three from the same underlying definitions and storage substrate.
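A minimal sketch of what "the same underlying definition" could look like: one declarative object that both a training reader and an online reader evaluate, differing only in the point in time they ask for. The `SequenceDefinition` class, its fields, and the `as_of_s` cutoff are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

# Hypothetical raw event layout: (timestamp_s, action_type).
RawEvent = Tuple[int, str]


@dataclass(frozen=True)
class SequenceDefinition:
    name: str
    actions: FrozenSet[str]  # action types included in the sequence
    max_len: int             # assumed to be positive

    def materialize(self, events: List[RawEvent], as_of_s: int) -> List[RawEvent]:
        """Return the newest max_len matching events at or before as_of_s, in time order."""
        kept = sorted(e for e in events if e[1] in self.actions and e[0] <= as_of_s)
        return kept[-self.max_len:]


definition = SequenceDefinition("engagements", frozenset({"click", "save"}), max_len=2)
log = [(10, "click"), (20, "impression"), (30, "save"), (40, "click")]

# Training: reconstruct the sequence as it looked at a past timestamp.
training_view = definition.materialize(log, as_of_s=35)
# Online inference: evaluate the same definition at request time.
online_view = definition.materialize(log, as_of_s=100)
```

Because both views come from the same definition, any change to the filter or the length cap applies to training and serving together.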
Difference from related concepts
- vs feature store — feature stores are about aligning training + serving fetches of features; user event sequences are a specific feature shape (ordered list of enriched events) that places harder requirements on temporal ordering, late-arrival handling, and consistent enrichment between streaming + batch paths. A feature store is one implementation substrate for sequences; the sequence concept extends to the platform that produces them.
- vs long user sequence modeling — that concept is about model-side handling of long sequences (e.g. Pinterest's ~16K tokens, context compression, attention pruning); user event sequence is the data primitive those models consume.
Seen in
- sources/2026-05-21-pinterest-making-user-sequence-data-more-cost-efficient-faster-and-easier-to-use — the canonical four-stage definition (ingest → filter → enrich → assemble) and the "not the N latest events from a log table" framing.
- sources/2026-04-13-pinterest-scaling-recommendation-systems-with-request-level-deduplication — same ~16K-token user-sequence object treated as the deduplication target.
- sources/2026-05-08-pinterest-enhancing-ad-relevance-integrating-real-time-context-into-sequential-recommender-models — user offsite-conversion sequences as Transformer-encoder input.