Netflix Atlas
Atlas is Netflix's primary telemetry / metrics platform: an in-memory, dimensional time-series database for operational metrics. It was introduced publicly in 2014 and open-sourced at github.com/Netflix/atlas; its canonical introduction is the 2014 Netflix Tech Blog post "Introducing Atlas".
This wiki page is a stub: Atlas is referenced here only as the metrics backend in the noisy-neighbor eBPF monitor. A dedicated Atlas source page has not yet been ingested; expand this page when the 2014 introduction or a later architecture post is distilled.
Role in the noisy-neighbor eBPF monitor
The Go userspace agent in the runq.latency monitor emits two Atlas metrics per container:
- runq.latency: a percentile timer (Atlas's histogram-style metric) dimensioned by container ID.
- sched.switch.out: a counter dimensioned by container ID and tagged with the preemption-cause category (same_cgroup / different_container / system_service). This tag is the dimension that makes the dual-metric disambiguation pair interpretable.
The percentile-timer + tagged-counter pair is a reusable Atlas emission shape for any scheduler / queue observability use case.
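The emission shape can be sketched in Go as follows. This is a toy in-memory model, not the agent's actual code and not the Atlas or Spectator client API: `MetricID`, `Registry`, and all of their methods are hypothetical names, and the tag keys `container` and `cause` are assumptions (the source only says the metrics are dimensioned by container ID and tagged with a preemption-cause category). A real percentile timer stores bucketed counts; this sketch keeps raw samples and uses a nearest-rank percentile purely for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

// MetricID is a hypothetical stand-in for a dimensional metric identity:
// a metric name plus a set of tag dimensions.
type MetricID struct {
	Name string
	Tags map[string]string
}

// Key renders the ID as a stable string: the name followed by sorted tag pairs.
func (m MetricID) Key() string {
	keys := make([]string, 0, len(m.Tags))
	for k := range m.Tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	b.WriteString(m.Name)
	for _, k := range keys {
		fmt.Fprintf(&b, ",%s=%s", k, m.Tags[k])
	}
	return b.String()
}

// Registry is a toy in-memory sink (hypothetical; stands in for the metrics backend).
type Registry struct {
	timers   map[string][]time.Duration
	counters map[string]int64
}

func NewRegistry() *Registry {
	return &Registry{timers: map[string][]time.Duration{}, counters: map[string]int64{}}
}

// Tag keys "container" and "cause" are assumed names for illustration.
func runqLatencyID(containerID string) MetricID {
	return MetricID{Name: "runq.latency", Tags: map[string]string{"container": containerID}}
}

func switchOutID(containerID, cause string) MetricID {
	return MetricID{Name: "sched.switch.out",
		Tags: map[string]string{"container": containerID, "cause": cause}}
}

// RecordRunqLatency adds one run-queue latency sample for a container.
func (r *Registry) RecordRunqLatency(containerID string, d time.Duration) {
	k := runqLatencyID(containerID).Key()
	r.timers[k] = append(r.timers[k], d)
}

// CountSwitchOut increments the preemption counter for a container and cause.
func (r *Registry) CountSwitchOut(containerID, cause string) {
	r.counters[switchOutID(containerID, cause).Key()]++
}

// RunqLatencyPercentile returns the nearest-rank p-th percentile (p in 1..100),
// or 0 when no samples were recorded for the container.
func (r *Registry) RunqLatencyPercentile(containerID string, p int) time.Duration {
	samples := r.timers[runqLatencyID(containerID).Key()]
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	rank := (p*len(sorted) + 99) / 100 // integer ceiling of p*n/100
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// SwitchOutCount reads back the counter for a container and cause.
func (r *Registry) SwitchOutCount(containerID, cause string) int64 {
	return r.counters[switchOutID(containerID, cause).Key()]
}

func main() {
	reg := NewRegistry()
	for _, ms := range []int{1, 2, 3, 40} {
		reg.RecordRunqLatency("c-123", time.Duration(ms)*time.Millisecond)
	}
	reg.CountSwitchOut("c-123", "different_container")
	reg.CountSwitchOut("c-123", "different_container")
	reg.CountSwitchOut("c-123", "same_cgroup")

	fmt.Println("p99 runq.latency:", reg.RunqLatencyPercentile("c-123", 99))
	fmt.Println("switch-outs by another container:",
		reg.SwitchOutCount("c-123", "different_container"))
}
```

Reading the two metrics together is the point of the pair: a high runq.latency percentile for a container, combined with a sched.switch.out count dominated by the different_container cause, suggests a noisy neighbor rather than contention within the container's own cgroup.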
Seen in
- sources/2024-09-11-netflix-noisy-neighbor-detection-with-ebpf: Atlas is the metrics sink; the runq.latency percentile timer and the sched.switch.out counter are the two metric shapes used.