

Netflix FlowExporter

FlowExporter is Netflix's per-host eBPF sidecar that captures TCP flow records from every Netflix workload in AWS. It attaches to kernel TCP tracepoints to observe socket state changes; when a TCP socket closes, it emits a flow log record containing the IP addresses, ports, timestamps, and additional socket statistics. Average volume is ~5 million records per second fleet-wide. Records are reported to FlowCollector in 1-minute batches.
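
The close-then-batch behaviour can be sketched in a few lines of Python. This is only an illustration: `ClosedSocket` and `batch_by_minute` are hypothetical names, the list of events stands in for what the eBPF program would deliver, and bucketing by close time is an assumption (the source only says batches cover 1 minute).

```python
from collections import defaultdict
from dataclasses import dataclass

BATCH_SECONDS = 60  # the source says records are reported in 1-minute batches


@dataclass(frozen=True)
class ClosedSocket:
    """Hypothetical stand-in for the data available when a TCP socket closes."""
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int
    t_start: float  # seconds since epoch
    t_end: float    # seconds since epoch (close time)


def batch_by_minute(events):
    """Group close events into 1-minute buckets keyed by bucket start time.

    Assumption: a record belongs to the bucket containing its close time.
    """
    batches = defaultdict(list)
    for ev in events:
        bucket = int(ev.t_end // BATCH_SECONDS) * BATCH_SECONDS
        batches[bucket].append(ev)
    return dict(batches)


events = [
    ClosedSocket("10.0.0.5", 41000, "10.0.1.9", 443, 100.0, 119.5),
    ClosedSocket("10.0.0.5", 41002, "10.0.1.9", 443, 110.0, 121.0),
]
batches = batch_by_minute(events)
```

Here the two sockets close on either side of a minute boundary, so they land in different batches.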

What it does

FlowExporter resolves the local side of every flow to a workload identity at capture time, which is what lets the downstream attribution pipeline attribute flows correctly. There are two paths:

  • EC2 path. Workloads running directly on EC2 instances get identity certificates provisioned at boot by Metatron. FlowExporter reads the cert from local disk to learn the host's workload identity.
  • Container path (Titus). A host runs many container workloads with different identities, so the lookup must be per-socket. IPManAgent writes the IP-address → workload-ID mapping into an eBPF map. When FlowExporter's BPF program sees a socket event from a TCP tracepoint, it looks up the socket's local IP in the map to resolve the owning workload.
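
The two paths above can be sketched as a single lookup function. This is a simplified model, not FlowExporter's code: a plain `dict` stands in for the eBPF map that IPManAgent populates, `host_identity` stands in for the identity read from the Metatron certificate, and the function name and workload IDs are hypothetical.

```python
from typing import Dict, Optional


def resolve_local_workload(
    local_ip: str,
    host_identity: Optional[str],
    ip_to_workload: Dict[str, str],
) -> Optional[str]:
    """Resolve the workload that owns the local side of a flow.

    EC2 path: the whole host has one identity, so return it directly.
    Container path: look up the socket's local IP in the per-host map.
    """
    if host_identity is not None:
        return host_identity
    return ip_to_workload.get(local_ip)


# EC2 host: identity comes from the certificate, the map is unused.
ec2_owner = resolve_local_workload("10.0.0.5", "service-a", {})

# Container host: no host-wide identity, so the lookup is per socket.
ip_map = {"10.0.2.11": "service-b", "10.0.2.12": "service-c"}
container_owner = resolve_local_workload("10.0.2.12", None, ip_map)
unknown_owner = resolve_local_workload("10.0.2.99", None, ip_map)
```

Returning `None` for an unmapped IP is a choice made for this sketch; the source does not say how FlowExporter handles a missing entry.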

IPv6 → IPv4 translation disambiguation. Netflix's NAT64-free IPv6-to-IPv4 mechanism intercepts connect() and replaces the underlying socket with one using a shared IPv4 host address. The kernel then reports the same local IPv4 address for sockets created by different container workloads, so a lookup by local IP alone is ambiguous. Netflix modified Titus to write a (local IPv4 address, local port) → workload-ID mapping into a second eBPF map on each intercepted connect; FlowExporter reads this map to disambiguate.
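
A minimal sketch of the disambiguation, again with plain dicts standing in for the two eBPF maps. The function name, addresses, and workload IDs are hypothetical, and checking the (IP, port) map before the IP-only map is an assumption about lookup order that the source does not spell out.

```python
from typing import Dict, Optional, Tuple


def resolve_with_translation(
    local_ip: str,
    local_port: int,
    port_map: Dict[Tuple[str, int], str],
    ip_map: Dict[str, str],
) -> Optional[str]:
    """Resolve a workload when several containers may share one local IPv4.

    Assumed order: try the (local IPv4, local port) map written on each
    intercepted connect first, then fall back to the IP-only map.
    """
    owner = port_map.get((local_ip, local_port))
    if owner is not None:
        return owner
    return ip_map.get(local_ip)


SHARED_HOST_IP = "10.0.3.1"  # hypothetical shared IPv4 host address
port_map = {
    (SHARED_HOST_IP, 40001): "service-b",
    (SHARED_HOST_IP, 40002): "service-c",
}
ip_map = {"10.0.2.11": "service-b"}

owner_1 = resolve_with_translation(SHARED_HOST_IP, 40001, port_map, ip_map)
owner_2 = resolve_with_translation(SHARED_HOST_IP, 40002, port_map, ip_map)
owner_3 = resolve_with_translation("10.0.2.11", 55000, port_map, ip_map)
```

The first two sockets share a local IP but resolve to different workloads because the port is part of the key.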

Record shape

Each reported flow carries at minimum: (local_ip, local_workload_id, remote_ip, local_port, remote_port, t_start, t_end, socket_stats). Because local attribution happens at capture, downstream time-range heartbeats are keyed on (local_ip, local_workload_id, t_start, t_end) (concepts/heartbeat-based-ownership).
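
For illustration, the record and its heartbeat key could be modelled as follows. The field names come from the tuple above; the field types, the `FlowRecord` and `heartbeat_key` names, and the example values are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FlowRecord:
    """Minimum fields listed in the text; the types are assumed."""
    local_ip: str
    local_workload_id: str
    remote_ip: str
    local_port: int
    remote_port: int
    t_start: float
    t_end: float
    socket_stats: Tuple[Tuple[str, int], ...] = ()  # opaque in this sketch


def heartbeat_key(record: FlowRecord) -> Tuple[str, str, float, float]:
    """Project a flow record onto the heartbeat key described in the text."""
    return (
        record.local_ip,
        record.local_workload_id,
        record.t_start,
        record.t_end,
    )


record = FlowRecord(
    local_ip="10.0.2.11",
    local_workload_id="service-b",
    remote_ip="10.0.1.9",
    local_port=40001,
    remote_port=443,
    t_start=100.0,
    t_end=119.5,
    socket_stats=(("bytes_sent", 2048),),
)
key = heartbeat_key(record)
```

The heartbeat drops the remote side and the socket statistics: it only asserts who held the local IP during the flow's lifetime.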

Role in the attribution pipeline

FlowExporter is one leaf of the heartbeat-derived IP ownership map architecture (patterns/heartbeat-derived-ip-ownership-map): every close event is simultaneously a business flow record and a heartbeat reaffirming who owns the local IP for that time window. This is the load-bearing reframing: rather than relying on a separate IP-assignment event stream (Sonar), which would need ordering and delivery guarantees that distributed systems do not actually provide, attribution emerges from the data plane itself.
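
One way a downstream consumer might use such heartbeats is a time-range lookup: given an IP and a timestamp, find a heartbeat whose window covers it. This is a hedged sketch of the idea only; the linear scan, the `owner_at` name, and the sample data are assumptions, and the source does not describe the real pipeline's data structures.

```python
from typing import Iterable, Optional, Tuple

# (local_ip, local_workload_id, t_start, t_end), as in the heartbeat key.
Heartbeat = Tuple[str, str, float, float]


def owner_at(heartbeats: Iterable[Heartbeat], ip: str, t: float) -> Optional[str]:
    """Return the workload whose heartbeat covers `ip` at time `t`, if any."""
    for hb_ip, workload, t_start, t_end in heartbeats:
        if hb_ip == ip and t_start <= t <= t_end:
            return workload
    return None


heartbeats = [
    ("10.0.1.9", "service-a", 0.0, 200.0),
    ("10.0.1.9", "service-d", 300.0, 500.0),  # the IP was later reassigned
]

early_owner = owner_at(heartbeats, "10.0.1.9", 110.0)
late_owner = owner_at(heartbeats, "10.0.1.9", 400.0)
gap_owner = owner_at(heartbeats, "10.0.1.9", 250.0)
```

Because each heartbeat carries its own time window, the answer for a given instant does not depend on the order in which heartbeats arrive.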

