PATTERN Cited by 1 source
Three-stage ingest-fusion-index pipeline¶
Problem¶
You have raw ML-model output (annotations, detections, embeddings) that needs to be searchable at second-granularity across many models of different modalities at media-catalog scale. Direct indexing of raw per-model output into a search engine fails because:
- Every model's output shape is different; a flat index can't express cross-modality queries.
- Intersection of continuous-interval annotations across
modalities is
O(M × N)— not tractable at query time. - Re-running a model must not create duplicate search documents; the index must stay a single source of truth.
- Heavy intersection computation must not bottleneck real-time ingest throughput.
Solution¶
Split the pipeline into three decoupled stages glued by an event bus:
- Transactional persistence — raw per-model annotations land in a write-optimised store (e.g. Cassandra) through a dedicated annotation service. Data-integrity + high-throughput-write is the only job.
- Offline data fusion — an event from stage 1 triggers an asynchronous job that discretizes annotations into fixed-size time buckets (concepts/temporal-bucket-discretization) and computes cross-model intersections (concepts/multimodal-annotation-intersection). Enriched records go back to the durable store.
- Indexing for search — a subsequent event triggers
ingestion of enriched bucket records into a search engine
(e.g. Elasticsearch) as nested documents
(concepts/nested-document-indexing) via composite-key
upsert (concepts/composite-key-upsert) on
(asset_id, time_bucket).
Each stage uses the tool optimised for its workload:
| Stage | Store | Optimised for |
|---|---|---|
| 1 — persist | Cassandra | high write throughput, durability |
| 2 — fuse | Cassandra (re-written enriched rows) | bulk batch compute |
| 3 — index | Elasticsearch | low-latency multimodal search |
Why it works¶
- Ingest stays fast — stage 1 doesn't wait for the heavy intersection math in stage 2. Netflix's design posture: "Cleanly decoupling these intensive processing tasks from the ingestion pipeline guarantees that complex data intersections never bottleneck real-time intake" (Source: sources/2026-04-04-netflix-powering-multimodal-intelligence-for-video-search).
- Each stage is resilient to downstream failure — stage 1 can accept writes while stage 2 is down; stage 2 can reprocess backlogs when stage 3 is down. The Kafka events in between absorb the transient decoupling.
- Re-runs are safe — stage 3's composite-key upsert
guarantees each
(asset, bucket)Elasticsearch document is updated in place, not duplicated. - Query altitude matches storage — the search index carries the fused / denormalised shape optimised for multimodal queries; the transactional store carries the raw per-model shape optimised for ingest + re-processing.
Canonical instance¶
Netflix Search's multimodal video-search pipeline (sources/2026-04-04-netflix-powering-multimodal-intelligence-for-video-search):
- Stage 1: systems/netflix-marken over systems/apache-cassandra.
- Stage 2: offline fusion triggered by systems/kafka; enriched second-by-second bucket records written back to Cassandra.
- Stage 3: systems/elasticsearch nested documents keyed by
(asset_id, time_bucket).
Adjacent patterns¶
- patterns/offline-fusion-via-event-bus — the stage-1 → stage-2 edge specifically.
- patterns/temporal-bucketed-intersection — the stage-2 internal algorithm.
- patterns/nested-elasticsearch-for-multimodal-query — the stage-3 document shape.
Caveats¶
- Three stages is three sources of eventual consistency; search results lag ingest by (fusion-job-delay + index-refresh-lag).
- Multi-stage pipelines are harder to debug than monolithic ingest paths — observability must cover the event bus, each stage's health, and end-to-end traces across all three.
- Netflix doesn't disclose the fusion-job scheduling cadence, back-pressure strategy when stage 3 falls behind, or the behaviour when a model re-emits annotations for a historical bucket.
- Model-version-aware semantics in stage 2 are open — does a new
version of a model replace or augment prior annotations for
an
(asset, bucket)?
Related¶
- concepts/temporal-bucket-discretization
- concepts/multimodal-annotation-intersection
- concepts/composite-key-upsert
- concepts/nested-document-indexing
- patterns/offline-fusion-via-event-bus
- patterns/temporal-bucketed-intersection
- patterns/nested-elasticsearch-for-multimodal-query
- systems/netflix-marken · systems/apache-cassandra · systems/kafka · systems/elasticsearch