CONCEPT Cited by 1 source
Late-arriving data¶
Definition¶
Late-arriving data is event data that reaches the analytics / streaming system after the "wall clock" bucket that the event belongs to has already started being processed or queried. Events can be delayed for many reasons — network partitions, client buffering, mobile offline → online syncs, upstream backpressure, processing lag in the ingestion pipeline. The result is that the value of a recent time bucket is not yet final — more events may still land in it.
Two forcing-function properties¶
- Recent buckets are unstable. A data point from 30 seconds ago might still change as late-arriving events trickle in.
- Old buckets are final. A data point from 30+ minutes ago is almost certainly settled.
This asymmetry is the forcing function behind several real-time-analytics design choices.
Design consequences¶
(a) Query-time watermarks¶
Many operational dashboards explicitly set the right boundary of the query interval to now - 1m or now - 5s to avoid the "flappy tail" of currently-arriving data flickering across refreshes. Netflix's Druid-cache post names this pattern directly (Source: sources/2026-04-06-netflix-stop-answering-the-same-question-twice-interval-aware-caching-for-druid).
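The clamping itself is trivial; a minimal sketch (function name and signature are illustrative, not from the source):

```python
def clamp_query_interval(
    start_s: float, end_s: float, now_s: float, watermark_s: float = 60.0
) -> tuple[float, float]:
    """Pull the right boundary of a dashboard query back to now - watermark_s,
    so the still-filling trailing bucket never appears in results."""
    safe_end = min(end_s, now_s - watermark_s)
    # Never return an inverted interval if the watermark swallows the whole query.
    return start_s, max(start_s, safe_end)
```

Every refresh re-clamps against the current clock, so the visible window slides forward while always excluding the unstable tail.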
(b) Age-based exponential TTLs in caches¶
Because recent buckets are unstable and old buckets are final, a cache that serves time-series data can safely cache old buckets for a long time and must refresh recent buckets quickly. The canonical wiki instance is Netflix's Druid cache — 5 s TTL for <2-min-old buckets, doubling per additional minute of age, capped at 1 hour.
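One plausible reading of that ladder as code (the exact doubling offset is an assumption; the source gives only the 5 s floor, per-minute doubling, and 1 h cap):

```python
def bucket_ttl_s(age_min: float, base_s: float = 5.0, cap_s: float = 3600.0) -> float:
    """TTL for a cached time-series bucket by age: the 5 s floor for buckets
    under 2 minutes old, one doubling per additional whole minute, capped at 1 h."""
    if age_min < 2:
        return base_s
    return min(cap_s, base_s * 2 ** int(age_min - 1))
```

The shape is what matters: fresh buckets are re-fetched every few seconds, while hour-old buckets are served from cache essentially forever.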
(c) Druid's cache conservatism¶
Apache Druid itself refuses to cache query results that involve realtime segments (still-being-indexed buckets) in its built-in full-result cache, precisely because late arrivals could invalidate the result. Druid prefers deterministic, stable cache results and query correctness over a higher hit rate (Source: sources/2026-04-06-netflix-stop-answering-the-same-question-twice-interval-aware-caching-for-druid).
(d) Trailing-bucket exception in negative caching¶
Netflix's Druid cache will negative-cache empty interior buckets (genuinely sparse metrics) but not empty trailing buckets — because an empty trailing bucket might just be data that hasn't arrived yet.
(e) Streaming-system watermarks¶
The streaming-system world (Flink, Spark Structured Streaming, Dataflow / Beam) has the same problem at an earlier layer: watermarks and allowed-lateness are how stream processors bound how long to wait for late events before closing a window. Netflix's Druid cache is solving a downstream version of the same problem at the query / dashboard layer rather than the aggregation layer.
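The aggregation-layer mechanism can be sketched without any framework: a watermark trails the maximum event time seen by a fixed allowed lateness, and a window may close once the watermark passes its end (class and method names are illustrative, not any particular engine's API):

```python
class WatermarkTracker:
    """Minimal event-time watermark: watermark = max event time seen minus
    allowed lateness; a window is closeable once the watermark passes its end."""

    def __init__(self, allowed_lateness_s: float):
        self.allowed_lateness_s = allowed_lateness_s
        self.max_event_time = float("-inf")

    def observe(self, event_time_s: float) -> None:
        self.max_event_time = max(self.max_event_time, event_time_s)

    @property
    def watermark(self) -> float:
        return self.max_event_time - self.allowed_lateness_s

    def window_closed(self, window_end_s: float) -> bool:
        return self.watermark >= window_end_s
```

An event arriving after its window closes is the stream processor's version of the "late-arriving data" problem the cache handles downstream.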
Netflix's pipeline latency number¶
Netflix's end-to-end data pipeline latency is <5 s at P90 (Source: sources/2026-04-06-netflix-stop-answering-the-same-question-twice-interval-aware-caching-for-druid). This number bounds how much late-arrival ambiguity is likely: after 5 s, most events that were ever going to arrive have arrived. That bound is what justifies the cache's 5 s minimum TTL on the freshest bucket — the cache adds no meaningful staleness on top of what the pipeline's P90 lag is already introducing.
Relationship to ordering¶
Late-arriving data is not the same as out-of-order events on the same key — but they co-occur. Late arrivals are usually also out-of-order with respect to wall-clock ingestion order, and event-time-ordered processing (rather than arrival-time-ordered) is the standard mitigation. The concepts/last-write-wins page notes "late-arriving older events don't clobber newer values — LWW is aligned with the application's ordering, not the arrival order" — which is the same trade-off seen from the write-side.
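That write-side trade-off fits in a few lines (a generic sketch of event-time LWW, not any particular store's implementation):

```python
def lww_apply(store: dict, key: str, value, event_ts: float) -> bool:
    """Apply a write only if its *event* timestamp beats the stored one,
    so a late-arriving older event cannot clobber a newer value."""
    current = store.get(key)
    if current is not None and current[0] >= event_ts:
        return False  # older event arrived late: ignored
    store[key] = (event_ts, value)
    return True
```

Comparing event timestamps rather than arrival order is exactly the "aligned with the application's ordering" property the LWW page describes.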
Seen in¶
- sources/2026-04-06-netflix-stop-answering-the-same-question-twice-interval-aware-caching-for-druid — Netflix's explicit framing + pipeline-latency number + the forcing function for the cache's exponential-TTL ladder + the trailing-bucket negative-caching exception.
Related¶
- concepts/exponential-ttl — the main caching response.
- concepts/negative-caching — and its trailing-bucket exception.
- concepts/granularity-aligned-bucket — how the bucket model carries the late-arrival information.
- concepts/rolling-window-query — the workload shape.
- concepts/last-write-wins — the write-side analog.
- systems/apache-druid — where late-arriving segments live.
- systems/netflix-druid-interval-cache — the caching layer.
- patterns/interval-aware-query-cache