CONCEPT Cited by 1 source
Telemetry TTL as a one-way door¶
Telemetry data has a TTL. Metrics, logs, and traces expire at the end of retention windows, whether vendor-imposed or self-imposed. Once they expire, the structure and relationships that held across signals at that moment in time cannot be reconstructed. Any process that depends on re-deriving an old investigation context is therefore a one-way door: its inputs decay on a fixed clock, and deferring reconstruction work is a decision to lose it permanently.
Where it bites¶
The canonical case is agent evaluation. An evaluation label's world snapshot is built by walking the telemetry graph that existed at the time of the incident. If labels are created narrowly (only signals directly tied to the root cause), and the team later discovers that noise must be included to make the eval predictive, the only labels that can be widened are those whose source telemetry hasn't yet expired. The rest must be discarded.
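The widening constraint above can be sketched as a partition over labels. This is an illustrative sketch only: the label fields, the 15-day window, and all names are assumptions, not the actual Bits AI SRE data model.

```python
from datetime import datetime, timedelta

# Assumed retention window for illustration; real windows are
# vendor- or self-imposed and vary per signal type.
RETENTION = timedelta(days=15)

def partition_labels(labels, now, retention=RETENTION):
    """Split labels into those whose source telemetry is still live
    (widenable) and those whose inputs have expired (must be discarded)."""
    widenable, lost = [], []
    for label in labels:
        if now - label["incident_time"] < retention:
            widenable.append(label)
        else:
            lost.append(label)
    return widenable, lost

now = datetime(2026, 4, 7)
labels = [
    {"id": "a", "incident_time": now - timedelta(days=3)},   # telemetry live
    {"id": "b", "incident_time": now - timedelta(days=40)},  # telemetry gone
]
keep, drop = partition_labels(labels, now)
```

The point of the sketch is that the split is decided entirely by the clock, not by how valuable a label is: `b` is unrecoverable no matter how much the team wants to widen it.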
Datadog quantifies the cost of one such regeneration:
- Pass rate dropped ~11% (wider, noisier snapshots are harder).
- Label count dropped ~35% (narrow labels whose source had already expired were unrecoverable).
"Snapshotting telemetry is a one-way door. Once telemetry expires, its structure and signals cannot be reconstructed." — sources/2026-04-07-datadog-bits-ai-sre-eval-platform
Design implications¶
- Snapshot widely, early, often. The cost of an over-wide snapshot is storage; the cost of an under-wide one is a label that can't be fixed when you realise what you missed.
- Snapshot queries, not raw bytes. Raw telemetry expires; the schema of the world (which queries ran against which data sources with what filters) is cheap to store indefinitely. But a stored query can only be replayed while the downstream data stores still retain the data it targets: queries cannot resurrect raw data that has already expired, so replay remains bounded by the TTL window.
- Label quality is a leading indicator. If labels created today must remain viable through 12 months of eval iteration, treat any information missing from today's snapshots as a permanent gap.
- Accept short-term regression for long-term fidelity. The Datadog team explicitly took an ~11% pass-rate and ~35% label-count hit on their headline numbers to get evals that predicted production behaviour. A team that treats the metric itself as the goal will not make that trade.
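The "snapshot queries, not raw bytes" idea can be made concrete as a small record plus a replay guard. The field names, the `replayable` helper, and the example values are all hypothetical, not a real snapshot format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class QuerySnapshot:
    data_source: str   # e.g. "logs", "metrics", "traces"
    query: str         # filter expression exactly as issued at label time
    start: datetime    # time range the query covered
    end: datetime

def replayable(snap: QuerySnapshot, now: datetime,
               retention: timedelta) -> bool:
    """A stored query is cheap to keep forever, but it can only be
    replayed while the raw data it targets is still retained."""
    return now - snap.start < retention

snap = QuerySnapshot("logs", "service:checkout status:error",
                     datetime(2026, 1, 1), datetime(2026, 1, 1, 2))
retention = timedelta(days=15)
replayable(snap, datetime(2026, 1, 10), retention)  # inside the window
replayable(snap, datetime(2026, 2, 1), retention)   # raw data expired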
Generalisation beyond agent evaluation¶
The pattern applies anywhere reconstruction depends on perishable observability data: postmortem analysis, compliance replay, threat-hunt retrospectives, long-running A/B-test effects. Any process that assumes "we can always go back and recompute" against telemetry whose retention is shorter than the analysis horizon has a one-way door embedded in it.
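That retention-versus-horizon condition is mechanical enough to audit. A minimal sketch, assuming invented process names and windows:

```python
from datetime import timedelta

def one_way_doors(processes):
    """Flag every process whose telemetry retention is shorter than
    the horizon over which it expects to recompute."""
    return [name for name, (retention, horizon) in processes.items()
            if retention < horizon]

# Hypothetical inventory: (telemetry retention, analysis horizon).
processes = {
    "postmortem-review": (timedelta(days=30), timedelta(days=90)),
    "compliance-replay": (timedelta(days=400), timedelta(days=365)),
}
flagged = one_way_doors(processes)
```

Anything flagged needs either longer retention or deliberate early snapshotting; waiting is the same as deciding to lose the data.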
Seen in¶
- sources/2026-04-07-datadog-bits-ai-sre-eval-platform — canonical articulation; motivates deliberate over-snapshotting in the Bits AI SRE evaluation platform.