PATTERN
Temporal bucketed intersection¶
Problem¶
Compute the intersection of continuous-time annotations from independent producers (multiple ML models, multiple modalities) in a way that is efficient to ingest, store, and query at catalog scale.
Naïve interval arithmetic is `O(N²)` per asset — every annotation pair checked for overlap. With dozens of modalities × hundreds of detections per asset × catalog-scale asset counts, this is not tractable online.
Solution¶
Three-step bucket-then-intersect algorithm:
- Bucket Mapping — segment every continuous annotation `[t_start, t_end]` into a set of fixed-size time buckets (e.g. 1-second). A `[2s, 8s]` annotation expands to buckets `[2-3, 3-4, 4-5, 5-6, 6-7, 7-8]`. See concepts/temporal-bucket-discretization.
- Annotation Intersection — for each bucket, collect all annotations (across all modalities) that include it. The bucket is the join key; intersection is key-equality set-union.
- Optimised Persistence — write a single enriched record per `(asset_id, time_bucket)` containing all co-occurring annotations. The record is the output; the asset + bucket tuple is the natural composite key. See concepts/composite-key-upsert.
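The three steps can be sketched in a few lines of Python. This is a minimal illustration, not the source's implementation: the dict-of-lists store and the `t_start` / `t_end` / `label` field names are assumptions.

```python
from collections import defaultdict
from math import ceil, floor

def to_buckets(t_start, t_end, size=1):
    """Bucket mapping: expand [t_start, t_end] into fixed-size buckets."""
    return [(b * size, (b + 1) * size)
            for b in range(floor(t_start / size), ceil(t_end / size))]

def enrich(asset_id, annotations, size=1):
    """Bucket-then-intersect: the bucket is the join key, so the
    intersection across producers is a key-equality set-union."""
    by_bucket = defaultdict(list)
    for ann in annotations:
        for bucket in to_buckets(ann["t_start"], ann["t_end"], size):
            by_bucket[bucket].append(ann["label"])
    # Optimised persistence: one enriched record per (asset_id, bucket),
    # keyed by the natural composite key.
    return {(asset_id, bucket): labels for bucket, labels in by_bucket.items()}
```

With one-second buckets, a `[2s, 8s]` annotation expands to six buckets, and a re-run of `enrich` produces identical composite keys, so writes upsert naturally.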
Canonical instance¶
Worked example from Netflix's multimodal video-search pipeline (sources/2026-04-04-netflix-powering-multimodal-intelligence-for-video-search):
- Character annotation: "Joey" from seconds 2 through 8 → 6 one-second buckets `[2-3 .. 7-8]`.
- Scene annotation: "kitchen" from seconds 4 through 9 → 5 one-second buckets `[4-5 .. 8-9]`.
- Intersection (shared buckets): `[4-5, 5-6, 6-7, 7-8]` — the four seconds where both are simultaneously true.
- Output: 4 enriched records, one per shared bucket, each carrying both the `CHARACTER_SEARCH` and `SCENE_SEARCH` child annotations.
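The worked example reproduces with plain set arithmetic (the `movie-123` asset id and the record layout are illustrative, not the source's schema):

```python
# Buckets as (start, end) one-second pairs, computed directly from the intervals.
joey    = {(b, b + 1) for b in range(2, 8)}   # "Joey", seconds 2 through 8: 6 buckets
kitchen = {(b, b + 1) for b in range(4, 9)}   # "kitchen", seconds 4 through 9: 5 buckets

shared = sorted(joey & kitchen)               # the bucket is the join key
records = [
    {"asset_id": "movie-123", "bucket": bucket,
     "annotations": ["CHARACTER_SEARCH:Joey", "SCENE_SEARCH:kitchen"]}
    for bucket in shared
]
assert shared == [(4, 5), (5, 6), (6, 7), (7, 8)]   # the four shared seconds
assert len(records) == 4                            # one enriched record per bucket
```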
Why it works¶
- Complexity collapses from `O(N²)` to `O(N × B)`, where `B` is the bucket count per annotation. For bounded-length annotations, `B` is small and constant.
- Bucket-keyed storage is naturally shardable — partition by `(asset_id, bucket)` and parallelism falls out.
- Multi-producer friendly — new modalities just contribute more entries per bucket without changing the algorithm.
- Model re-runs are trivially idempotent under composite-key upsert on `(asset_id, bucket)`.
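The idempotence point is visible in a toy store keyed by the composite key (the dict stands in for whatever keyed store is used; `movie-123` is an illustrative id):

```python
store = {}

def upsert(asset_id, bucket, annotations):
    # Last write wins per (asset_id, bucket): a model re-run rewrites
    # the same keys instead of appending duplicate rows.
    store[(asset_id, bucket)] = annotations

upsert("movie-123", (4, 5), ["Joey", "kitchen"])
upsert("movie-123", (4, 5), ["Joey", "kitchen"])  # re-run: same key, same record
assert len(store) == 1
```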
Granularity trade-off¶
- Bucket too coarse → loses timing precision; annotations shorter than the bucket merge with neighbours.
- Bucket too fine → amplifies storage cardinality and index size for no expressive gain.
Netflix uses one-second buckets in the worked example; the production bucket size isn't disclosed in the post.
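Both failure modes are easy to see on an integer-millisecond grid, which avoids float drift (the helper below is an illustration, not Netflix's code):

```python
def to_buckets(start_ms, end_ms, size_ms):
    """Bucket an annotation, given in milliseconds, onto a size_ms grid."""
    first = start_ms // size_ms
    last = -(-end_ms // size_ms)  # ceiling division on integers
    return [(b * size_ms, (b + 1) * size_ms) for b in range(first, last)]

# Too coarse: a 400 ms annotation fills one whole 1 s bucket and
# merges with anything else landing in seconds 2-3.
assert to_buckets(2100, 2500, 1000) == [(2000, 3000)]

# Too fine: the worked example's 6 s annotation becomes 60 records
# at 100 ms buckets, 10x the storage and index cardinality.
assert len(to_buckets(2000, 8000, 100)) == 60
```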
Sibling pattern¶
patterns/sliding-window-rollup-aggregation — Netflix's Distributed Counter. Same fixed-window discretization applied to events (counter increments) instead of interval annotations; same bucket-as-idempotent-aggregate record shape.
Caveats¶
- Bucket size is a global design choice; non-uniform bucket sizes across different modality pairs aren't expressible in this scheme.
- Intersections that require sub-bucket resolution (e.g. "character present at least 0.5 s within the bucket") aren't preserved — the original `time_range` on the child annotation record survives, but the intersection record doesn't compute it.
- Bucket-level records lose native interval-algebra expressive power; queries like "how long did the intersection last?" need downstream aggregation across contiguous buckets.