Composite-key upsert¶
Definition¶
A composite-key upsert is a write operation that uses a unique key
derived from multiple fields (e.g.
(asset_id, time_bucket)) as the stable document identifier,
so that a subsequent write carrying the same composite key
updates the existing document rather than creating a
duplicate.
Canonical wiki instance: Netflix's multimodal video-search
pipeline indexes enriched temporal-bucket records into
Elasticsearch using
(asset_id, time_bucket) as the _id — "if a temporal bucket
already exists for a specific second of video, perhaps populated
by an earlier model run, the system intelligently updates the
existing record rather than generating a duplicate. This
mechanism establishes a single, unified source of truth for every
second of footage"
(Source: sources/2026-04-04-netflix-powering-multimodal-intelligence-for-video-search).
Why composite keys¶
Two structural reasons motivate composite-key upsert in multimodal ingestion pipelines:
- Multiple producers write into the same bucket. Character
recognition and scene detection may both emit annotations for the
bucket (asset_id=A, time_bucket=[4s, 5s]). Neither can own the
document's surrogate ID because neither knows in advance which
modalities will co-occur.
- Models get re-run (retraining, prompt change, new version). A
re-run re-emits annotations for buckets that already have fused
records. Without a stable composite key, re-runs produce duplicate
rows; with one, they update in place.
Surrogate UUIDs fail both requirements because they embody producer identity (one-UUID-per-write) rather than subject identity (one-ID-per-asset-bucket).
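Both properties fall out of keying the store by subject rather than by write. A minimal sketch (not from the Netflix post; the producer names and annotation fields are illustrative, and a plain dict stands in for the search index):

```python
def doc_id(asset_id: str, time_bucket: int) -> str:
    # Deterministic, subject-derived key: same (asset, bucket) -> same ID,
    # regardless of which producer writes, or how many times.
    return f"{asset_id}:{time_bucket}"

index: dict[str, dict] = {}  # stand-in for the search index

def upsert(asset_id: str, time_bucket: int, annotations: dict) -> None:
    key = doc_id(asset_id, time_bucket)
    doc = index.setdefault(key, {"asset_id": asset_id, "time_bucket": time_bucket})
    doc.update(annotations)  # merge: later fields join or overwrite existing ones

# Two different producers write into the same bucket:
upsert("A", 4, {"characters": ["char_1"]})
upsert("A", 4, {"scene_label": "indoor"})
# A model re-run re-emits for the same bucket -- updates in place:
upsert("A", 4, {"characters": ["char_1", "char_2"]})

assert len(index) == 1  # exactly one document per (asset, bucket)
```

With a surrogate UUID per write, the three calls above would have produced three rows; the composite key collapses them into one.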
The composite key as a natural idempotency key¶
Composite keys derived from subject identity double as idempotency tokens: any two writes carrying the same composite key are by construction idempotent over a last-write-wins (or merge-wins) store.
Netflix layers this with the observation that "a single, unified
source of truth for every second of footage" emerges — the
invariant being that exactly one Elasticsearch document exists
per (asset, bucket), irrespective of how many times any
producer model has emitted output for that bucket.
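The idempotency claim can be stated as a one-line property: over a keyed last-write-wins store, replaying a write with the same composite key is a no-op. A toy check (illustrative field names, not from the post):

```python
def apply_write(store: dict, key: str, doc: dict) -> dict:
    new = dict(store)  # pure function, so states are easy to compare
    new[key] = doc     # last write wins for this composite key
    return new

write = ("A:4", {"scene_label": "indoor"})
once = apply_write({}, *write)
twice = apply_write(once, *write)
assert once == twice  # replay-safe: applying the write again changes nothing
```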
Compare with Netflix's other canonical composite-key idempotency instances:
- KV DAL — (generation_time, nonce) is the write-level idempotency
token making hedged and retried Cassandra writes safe.
- Distributed Counter — (event_time, event_id, event_item_key) is
the event-level idempotency key in the TimeSeries event store.
- This post — (asset_id, time_bucket) is the document-level
idempotency key in Elasticsearch, covering both multi-producer
writes and model re-runs.
All three are applications of the same discipline: pick a natural key derived from the subject of the write, not the write event itself.
Elasticsearch mechanics¶
Elasticsearch's _id field accepts any string; setting _id to
the string-encoded composite key and using the standard
PUT /<index>/_doc/<_id> (or index with the same _id)
produces native upsert semantics — the second write replaces the
first. No explicit version check is required when the merge is
structurally last-write-wins; for field-merged upserts the
update API with a script or doc_as_upsert: true is the
relevant primitive (not disclosed in the Netflix post).
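The two primitives reduce to request shapes. A sketch of both, built as raw path/body pairs (the index name video_buckets and the document fields are assumptions for illustration, not from the post):

```python
import json

asset_id, time_bucket = "A", 4
es_id = f"{asset_id}:{time_bucket}"  # string-encoded composite key as _id

# 1. Full-document replace: a second PUT with the same _id
#    overwrites the first document wholesale.
replace_path = f"PUT /video_buckets/_doc/{es_id}"
replace_body = json.dumps({"asset_id": asset_id, "time_bucket": time_bucket,
                           "scene_label": "indoor"})

# 2. Field-level merge: _update with doc_as_upsert creates the document
#    if absent, otherwise shallow-merges the given fields into it.
merge_path = f"POST /video_buckets/_update/{es_id}"
merge_body = json.dumps({"doc": {"characters": ["char_1"]},
                         "doc_as_upsert": True})

assert replace_path == "PUT /video_buckets/_doc/A:4"
```

Which of the two Netflix uses determines whether a model re-run replaces a bucket's prior annotations or merges alongside them; the post does not say (see Caveats below).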
Seen in¶
- sources/2026-04-04-netflix-powering-multimodal-intelligence-for-video-search
— canonical wiki instance.
(asset_id, time_bucket) as the Elasticsearch document _id for the multimodal video-search index; makes multi-model fusion and model re-runs safe by construction.
Caveats¶
- The Netflix post doesn't disclose whether the upsert is full-document replace (last-write-wins on the whole record) or field-level merge (add new annotations without overwriting existing ones). The architectural description suggests the latter, but the mechanism isn't spelled out.
- Composite keys in distributed indexes can create hot shards if the composite-key hash isn't uniform (e.g. one asset dominating traffic); Netflix doesn't describe sharding strategy.
- Model-version-aware semantics are not discussed — does a new version of the character-recognition model replace prior character annotations in a bucket, or accumulate alongside? Accumulation risks duplicate labels; replacement risks data loss if the retraining doesn't cover every prior detection.