CONCEPT
Offline compute / online lookup¶
Definition¶
Offline compute / online lookup is an architectural split in which an expensive analysis runs asynchronously on a batch cadence, produces a small configuration artefact, and publishes it to a fast runtime lookup store. Runtime consumers load the artefact once at initialisation and do in-memory lookups per request — never re-running the expensive analysis on the serving path.
It is the canonical answer to "we need to use the output of an expensive computation in a latency-sensitive hot path." The pattern shows up whenever:
- The computation is too slow or too costly to run on every request.
- The underlying phenomenon changes slowly enough that periodic re-computation is acceptable.
- The output is small enough to fit into runtime memory or a fast key-value store.
Pinterest's MIQPS framing¶
The canonical wiki instance is MIQPS (Source: sources/2026-04-20-pinterest-smarter-url-normalization-at-scale-how-miqps-powers-content-deduplication). The three phases:
- Continuous ingestion writes observed URLs to a per-domain corpus on S3 as a side effect of normal content processing.
- Offline MIQPS job downloads the corpus, runs the algorithm, performs anomaly detection, and publishes the MIQPS map to a config store + archives to S3.
- Runtime URL normaliser loads the MIQPS map at init and does in-memory lookups per URL.
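The split between the offline job and the runtime lookup can be sketched as follows. This is an illustrative reconstruction, not Pinterest's code: `analyse_important_params`, `offline_job`, and `UrlNormaliser` are hypothetical names, and the toy importance heuristic (parameter value varies across the corpus) stands in for the real content-ID comparison, which renders pages.

```python
from urllib.parse import urlparse, parse_qs

# --- Offline side (batch cadence) ---

def analyse_important_params(urls: list[str]) -> set[str]:
    # Toy stand-in for the expensive analysis: call a parameter "important"
    # if its value varies across the corpus. The real MIQPS job instead
    # renders pages with/without each parameter and compares content IDs.
    seen: dict[str, set[str]] = {}
    for url in urls:
        for key, values in parse_qs(urlparse(url).query).items():
            seen.setdefault(key, set()).update(values)
    return {key for key, values in seen.items() if len(values) > 1}

def offline_job(corpus: dict[str, list[str]]) -> dict[str, set[str]]:
    """Run the expensive per-domain analysis; emit a small artefact."""
    # In the real pipeline this artefact is published to a config store
    # and archived to S3 after anomaly detection.
    return {domain: analyse_important_params(urls)
            for domain, urls in corpus.items()}

# --- Online side (hot path) ---

class UrlNormaliser:
    def __init__(self, artefact: dict[str, set[str]]):
        self.important = artefact  # loaded once at init

    def normalise(self, domain: str, params: dict[str, str]) -> dict[str, str]:
        # Pure in-memory lookup per URL: drop parameters the offline
        # job deemed unimportant. Unknown domains keep everything.
        keep = self.important.get(domain, set(params))
        return {k: v for k, v in params.items() if k in keep}
```

The hot path never runs the analysis; at worst it pays one dictionary lookup and a filter per URL.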
Pinterest's framing:
"This separation of concerns means the expensive content ID comparison happens offline and asynchronously, while runtime URL normalization is a fast, in-memory lookup."
Why offline, not real-time — three load-bearing reasons¶
Pinterest articulates the hypothetical alternative and why they rejected it:
"An alternative design would be to determine parameter importance in realtime — rendering the page with and without each parameter at the moment a URL is first encountered. This would eliminate staleness entirely and provide immediate coverage for newly discovered domains. However, we chose the offline approach for several reasons:
- Latency: Each content ID computation requires rendering a full page, which takes seconds. Testing every parameter in a URL would multiply this cost, adding unacceptable latency to the content processing pipeline.
- Cost: Offline analysis scales with the number of domains, while realtime analysis would scale with the number of URLs — orders of magnitude more expensive.
- Reliability: Transient rendering failures in an offline job are isolated and retryable. In a realtime path, they would directly block content processing."
Scaling regime — domains vs URLs¶
The scaling-denominator switch is the core win: offline analysis scales compute with the slower-growing axis (number of distinct domains / patterns), while realtime analysis would scale with the fast-growing axis (number of URLs seen). For Pinterest, with "hundreds of thousands of domains" and "billions of URLs", the difference is roughly four orders of magnitude.
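The denominator switch can be checked with back-of-the-envelope arithmetic. The figures below are assumptions picked to match the quoted magnitudes, not Pinterest's actual numbers:

```python
# Illustrative cost arithmetic: offline work scales with domains,
# realtime work would scale with URLs. All figures are assumed.
DOMAINS = 300_000              # "hundreds of thousands of domains"
URLS = 3_000_000_000           # "billions of URLs"
RENDERS_PER_ANALYSIS = 20      # assumed renders needed to test one unit's parameters

offline_renders = DOMAINS * RENDERS_PER_ANALYSIS   # per offline cycle
realtime_renders = URLS * RENDERS_PER_ANALYSIS     # if done per first-seen URL

ratio = realtime_renders / offline_renders
print(f"realtime is {ratio:,.0f}x the offline work")  # 10,000x — ~4 orders of magnitude
```

Note the per-unit render cost cancels out: the ratio is just URLs / domains, which is why the gap persists regardless of how expensive a single analysis is.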
The pattern applies wherever the "one-per-domain" signal is useful for "one-per-request" decisions.
The staleness trade-off¶
Offline compute guarantees the runtime is using stale data (config from the last offline run). Pinterest's rationale for why this is OK:
"URL parameter conventions change infrequently — on the order of weeks or months. The small amount of staleness between computation cycles is an acceptable tradeoff for the massive savings in cost, latency, and operational complexity."
The decision framework:
- Acceptable staleness ≥ re-computation cadence → offline is fine.
- Acceptable staleness < re-computation cadence → need online updates (streaming / reactive / per-request computation).
Pinterest's acceptable staleness (~weeks to months) exceeds its re-computation cadence (one offline run per domain cycle) by a wide margin, so the offline branch clearly applies.
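The decision framework reduces to a single comparison. A minimal sketch, with hypothetical names and day-denominated inputs chosen for illustration:

```python
def choose_architecture(acceptable_staleness_days: float,
                        recompute_cadence_days: float) -> str:
    """Pick offline batch when tolerable staleness covers the batch cadence."""
    if acceptable_staleness_days >= recompute_cadence_days:
        return "offline-batch + online-lookup"
    return "online-updates (streaming / reactive / per-request)"

# Pinterest-shaped example: weeks of staleness tolerance vs a faster batch cadence
print(choose_architecture(acceptable_staleness_days=30,
                          recompute_cadence_days=7))
# prints "offline-batch + online-lookup"
```

The useful discipline here is making both quantities explicit: teams often know their batch cadence precisely but have never stated their staleness tolerance out loud.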
Contrast with streaming / reactive architectures¶
- Streaming: upstream events trigger incremental re-computation in near-real-time (e.g. Kafka → Flink → materialised view). Fresh but more complex.
- Reactive / lazy: compute on first request, cache, invalidate on demand. Fresh if touched recently, stale otherwise, unpredictable.
- Offline batch + online lookup (this concept): compute periodically, ignore intra-batch changes, accept bounded staleness. Simplest ops story.
Pinterest explicitly chose the simplest option and paid for it with bounded staleness — a choice that's optimal when the phenomenon is slow-moving.
Related patterns¶
- patterns/offline-compute-online-lookup-config — the pattern formulation of this concept.
- patterns/offline-train-online-resolve-compression — offline ML training + online inference; same shape.
- patterns/offline-teacher-online-student-distillation — offline teacher model + distilled online student; same shape.
- concepts/anomaly-gated-config-update — typically sits between the offline compute and online publish steps.
Generalisation¶
Any hot-path decision that benefits from an expensive derived config:
- Rate limiting: compute per-tenant tier offline from usage history, look up per-request at runtime.
- Routing: compute placement plans offline from capacity data, look up per-request.
- Pricing: compute price-elasticity models offline, look up per-request.
- Config-driven validation: compute policy lattice offline, evaluate inline.
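All four share the same online shape: load a small artefact once, serve in-memory lookups with a safe default for keys the last batch has not seen. A generic sketch (class and field names are hypothetical):

```python
import json
import time

class OfflineConfig:
    """Generic online side of the pattern: load once, look up in memory."""

    def __init__(self, artefact_json: str, default):
        self._table = json.loads(artefact_json)  # small artefact from the offline job
        self._default = default                  # fallback for keys the batch hasn't seen
        self._loaded_at = time.time()            # staleness is bounded by reload cadence

    def get(self, key: str):
        return self._table.get(key, self._default)

# e.g. rate limiting: per-tenant request tiers computed offline from usage history
tiers = OfflineConfig('{"tenant-a": 1000, "tenant-b": 50}', default=100)
print(tiers.get("tenant-a"))  # 1000
print(tiers.get("unknown"))   # 100 — the conservative default
```

The choice of default is where the staleness trade-off resurfaces: it is the behaviour every new domain, tenant, or route gets until the next offline run catches up.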
Seen in¶
- sources/2026-04-20-pinterest-smarter-url-normalization-at-scale-how-miqps-powers-content-deduplication — canonical Pinterest wiki introduction.