CONCEPT Cited by 1 source

One definition, many runtimes

Definition

One definition, many runtimes is a platform organising principle: "Define a signal or event type once, then instantiate it consistently across multiple runtimes." (Source: sources/2026-05-21-pinterest-making-user-sequence-data-more-cost-efficient-faster-and-easier-to-use).

A single configuration surface captures everything about a signal or feature pipeline — which raw events to use, which enrichments to apply, how to assemble the output — and that same definition is then consumed by multiple distinct execution runtimes:

  • Real-time / streaming indexing for low-latency updates.
  • Batch indexing + backfill for historical data + corrections.
  • Online serving for inference-time fetches.

The principle makes the definition the source of truth, not any of the runtimes.
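As a rough illustration of the idea, the sketch below declares one definition object and lets three runtimes derive their plans from it. The class `SignalDefinition`, its fields, the helper `describe_runtime`, and the runtime names are all hypothetical. They are not taken from Pinterest's platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalDefinition:
    """Hypothetical single source of truth for one signal."""
    name: str
    raw_events: tuple[str, ...]   # which raw events to use
    enrichments: tuple[str, ...]  # which enrichments to apply, in order
    max_length: int               # how to assemble the output sequence


def describe_runtime(runtime: str, definition: SignalDefinition) -> dict:
    """Each runtime derives its plan from the definition, never the reverse."""
    return {
        "runtime": runtime,
        "signal": definition.name,
        "inputs": list(definition.raw_events),
        "enrichments": list(definition.enrichments),
        "max_length": definition.max_length,
    }


# Defined once (illustrative values only).
USER_ENGAGEMENT = SignalDefinition(
    name="user_engagement",
    raw_events=("click", "save"),
    enrichments=("item_category",),
    max_length=100,
)

# Instantiated in every runtime from the same object.
RUNTIMES = ("streaming_indexer", "batch_indexer", "online_serving")
plans = [describe_runtime(r, USER_ENGAGEMENT) for r in RUNTIMES]
```

Because the definition is frozen and the runtimes only read from it, changing the signal means changing one object rather than three implementations.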

Failure mode it prevents

The split-brain failure mode (Source: sources/2026-05-21-pinterest-making-user-sequence-data-more-cost-efficient-faster-and-easier-to-use):

"Training pipelines build sequences one way from batch tables while serving systems assemble sequences a different way from online stores. Over time, those two views naturally drift apart in subtle ways."

Without a shared definition, every runtime grows its own implementation of conceptually identical logic. As event types, enrichments, filters, and schemas evolve, the runtimes drift independently. The result is online-offline discrepancy at the data-substrate layer: features in training don't match features in serving, and the discrepancy compounds the longer the platform runs.
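A toy illustration of this drift, using invented event data and function names: the training path and the serving path each hard-code their own filter, one of them is later updated, and the two sequences silently diverge. Routing both through one shared definition removes the divergence.

```python
# Invented sample events; "type" and "item" are illustrative field names.
events = [
    {"type": "click", "item": "a"},
    {"type": "share", "item": "b"},
    {"type": "save", "item": "c"},
]


def training_sequence(events):
    # Batch job written first: only clicks and saves.
    return [e["item"] for e in events if e["type"] in {"click", "save"}]


def serving_sequence(events):
    # Serving path later updated to include shares; the batch job was not.
    return [e["item"] for e in events if e["type"] in {"click", "save", "share"}]


# The two views now disagree on the same input.
drifted = training_sequence(events) != serving_sequence(events)

# With a single shared definition, both paths call the same function.
ALLOWED_TYPES = frozenset({"click", "save", "share"})


def build_sequence(events, allowed=ALLOWED_TYPES):
    return [e["item"] for e in events if e["type"] in allowed]
```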

Mechanism

Two structural pieces make one-definition-many-runtimes mechanically realisable:

  1. A portable definition format — Pinterest uses Python configs validated and compiled to portable JSON in object storage. The format is consumable by any runtime, in any language. See configuration-as-code feature pipeline.
  2. A shared execution engine — the same engine + executor logic runs in streaming and batch jobs. Runtime-specific concerns (concurrency model, IO substrate, scheduling) live in the framework; business logic (filtering, featurisation, raw → normalised mapping) lives in pluggable executors. See shared execution engine + pluggable executors.

Without both pieces, a "shared definition" is shared in name only: each runtime ends up reimplementing the logic anyway. Pinterest's design enforces both: the same compiled config drives every runtime, and the same engine logic runs in both streaming and batch jobs.
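The two pieces can be sketched together as follows. This is a minimal sketch under stated assumptions, not Pinterest's implementation: the config fields, `compile_definition`, the `EXECUTORS` registry, and the `Engine` class are all hypothetical, and an in-memory string stands in for JSON written to object storage.

```python
import json
from typing import Callable, Iterable, Iterator, Optional

# Hypothetical Python-side config for one signal.
CONFIG = {
    "name": "user_engagement",
    "version": 1,
    "event_types": ["click", "save"],
    "executor": "normalise_item",
}


def compile_definition(config: dict) -> str:
    """Validate the config and compile it to portable JSON."""
    required = {"name", "version", "event_types", "executor"}
    missing = required - set(config)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return json.dumps(config, sort_keys=True)


# Pluggable executors hold the business logic (raw -> normalised mapping).
EXECUTORS: dict[str, Callable[[dict], dict]] = {
    "normalise_item": lambda e: {"item": e["item"].lower(), "ts": e["ts"]},
}


class Engine:
    """Shared engine: the same filtering and executor logic for every caller."""

    def __init__(self, compiled: str):
        self.definition = json.loads(compiled)
        self._allowed = set(self.definition["event_types"])
        self._executor = EXECUTORS[self.definition["executor"]]

    def process(self, event: dict) -> Optional[dict]:
        if event["type"] not in self._allowed:
            return None
        return self._executor(event)


def run_streaming(engine: Engine, stream: Iterable[dict]) -> Iterator[dict]:
    # Record-at-a-time scheduling shape.
    for event in stream:
        out = engine.process(event)
        if out is not None:
            yield out


def run_batch(engine: Engine, table: list[dict]) -> list[dict]:
    # Whole-table scheduling shape over the same engine logic.
    return [out for out in map(engine.process, table) if out is not None]
```

Only `run_streaming` and `run_batch` differ between the two paths. The filtering and normalisation live in `Engine` and the executor, so a change to either reaches both paths at once.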

Application: Pinterest user-sequence platform

Pinterest's user-sequence platform applies one-definition-many-runtimes across:

  • Real-time indexer (streaming).
  • Batch indexer + backfill pipeline.
  • Online serving API (request-time enrichments + final selection).

All three runtimes consume the same Python-defined → JSON-compiled signal definitions. The streaming and batch paths additionally share executor logic via the shared execution engine, so the lambda architecture split is just two scheduling shapes of the same logic — not two reimplementations.

Why it's harder than it sounds

  • Runtime semantic mismatch. Streaming engines, batch engines, and online-serving stacks have different evaluation models (record-at-a-time vs DataFrame, exactly-once vs at-least-once, push vs pull). The shared definition must compile down to behaviour that's equivalent under each runtime's semantics.
  • Late-arriving + corrected data. The streaming runtime sees events in arrival order; the batch runtime can re-process out-of-order corrections. The definition needs to specify behaviour for both without sacrificing freshness in the streaming path.
  • Schema versioning. A definition lives across many simultaneous runtime instances (jobs in canary, jobs in production, batch backfills running through old data). Versioning + rollout policy needs to be part of the definition surface.
  • Test surface. Every change to the definition needs to be testable against representative data in all the runtimes it drives, otherwise the principle silently degrades into one-runtime-per-team again.
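One way to make the late-data and test-surface points concrete is a parity check: feed the same events to a streaming-style builder (arrival order, incremental state) and a batch-style builder (full sort), and assert that the outputs match. The sketch below assumes a policy of ordering by event time and keeping the most recent `max_length` items. That policy, the definition fields, and the function names are illustrative assumptions, not behaviour described by the source.

```python
import bisect

# Hypothetical definition; "version" stands in for a rollout-aware schema tag.
DEFINITION = {"name": "user_engagement", "version": 2, "max_length": 3}


def batch_build(events, definition):
    """Batch shape: sort everything by event time, keep the newest items."""
    ordered = sorted(events, key=lambda e: (e["event_ts"], e["item"]))
    return [e["item"] for e in ordered][-definition["max_length"]:]


def streaming_build(arrivals, definition):
    """Streaming shape: process in arrival order, keep a bounded sorted state."""
    state = []  # (event_ts, item) tuples, kept sorted by event time
    for e in arrivals:
        # A late event is inserted at its event-time position, not appended.
        bisect.insort(state, (e["event_ts"], e["item"]))
        # Drop the oldest entries beyond max_length (assumes max_length > 0).
        del state[:-definition["max_length"]]
    return [item for _, item in state]


# Arrival order differs from event-time order: "b" arrives late.
arrivals = [
    {"item": "a", "event_ts": 1},
    {"item": "c", "event_ts": 3},
    {"item": "b", "event_ts": 2},
    {"item": "d", "event_ts": 4},
]

# Parity check: both runtimes must agree on the final sequence.
parity = streaming_build(arrivals, DEFINITION) == batch_build(arrivals, DEFINITION)
```

A check of this kind would need to run on representative data for every definition change, otherwise the two builders can drift apart again without anyone noticing.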

