

Edge filtering

Edge filtering is the pipeline design move of evaluating match rules and dropping non-matching events at the producer (the host Agent, sidecar, or data plane closest to the event source), instead of forwarding everything to a central backend and filtering there.

The general argument:

  • Volume at the producer edge is dominated by noise; match rates for central detection or aggregation rules are typically in the low single digits of a percent.
  • Serializing + transporting that noise costs CPU, memory, and network at the producer, and storage/compute at the backend.
  • Filtering locally converts an O(events) transport into O(matches) transport — typically one to three orders of magnitude smaller.
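The argument above reduces to a single predicate applied at the producer. A minimal sketch (all names hypothetical, not tied to any particular agent): only matching events are serialized and transported, so the wire cost scales with matches, not events.

```python
from typing import Callable, Iterable, Iterator


def edge_filter(events: Iterable[dict],
                matches: Callable[[dict], bool]) -> Iterator[dict]:
    """Yield only matching events; everything else is dropped at the producer."""
    for event in events:
        if matches(event):
            yield event  # only matches are serialized and shipped


# 10,000 events with a 1% match rate: only 100 cross the network.
events = [{"path": f"/tmp/f{i}", "op": "open"} for i in range(9900)]
events += [{"path": "/etc/shadow", "op": "open"} for _ in range(100)]
forwarded = list(edge_filter(events, lambda e: e["path"].startswith("/etc/")))
```

The reduction factor is just the inverse match rate: at 1%, transport shrinks 100×.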

Two canonical instances

Security agent (file-event monitoring)

Datadog's Workload Protection FIM Agent evaluates detection rules locally on each host before forwarding events. Input is ~10B file-related events/min fleet-wide at ~5 KB each (multi-TB/s naively); after Agent-side rule evaluation + concepts/in-kernel-filtering, only ~1M events/min — the matches — are forwarded to the backend for detection and notification. (Source: sources/2025-11-18-datadog-ebpf-fim-filtering)
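As an illustration of the Agent-side step (the rule format below is invented for the sketch, not Datadog's actual rule language), local evaluation is just matching each file event against a small compiled rule set before deciding to forward:

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Rule:
    """Hypothetical compiled detection rule: path prefix plus allowed ops."""
    path_prefix: str
    ops: FrozenSet[str]


def matches_any(event: dict, rules: List[Rule]) -> bool:
    """Local rule evaluation: True means the event is forwarded to the backend."""
    return any(event["path"].startswith(r.path_prefix) and event["op"] in r.ops
               for r in rules)


rules = [Rule("/etc/", frozenset({"open", "chmod"})),
         Rule("/root/.ssh/", frozenset({"open", "unlink"}))]
events = [
    {"path": "/tmp/cache/x", "op": "open"},      # noise: dropped on-host
    {"path": "/etc/passwd", "op": "chmod"},      # match: forwarded
    {"path": "/root/.ssh/id_rsa", "op": "open"}, # match: forwarded
]
forwarded = [e for e in events if matches_any(e, rules)]
```

At fleet scale, this per-host check is what turns ~10B events/min of input into ~1M events/min of forwarded matches.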

Metrics pipeline (streaming aggregation)

Airbnb's vmagent tier drops per-instance labels in transit, aggregating metrics across a service's instances before they hit storage. Same shape: producer-side reduction of volume that the backend would have to discard anyway. Typically the cheapest 10× cost lever in a metrics pipeline. (Source: sources/2026-04-16-airbnb-statsd-to-otel-metrics-pipeline; see concepts/streaming-aggregation, systems/vmagent.)
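The aggregation step can be sketched in a few lines (label names are illustrative; this is the shape of the transform, not vmagent's implementation): strip the `instance` label, then sum values per remaining label set, so N per-instance series collapse into one per-service series before storage.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def aggregate_drop_instance(samples: List[dict]) -> Dict[Tuple, float]:
    """Sum values per (metric, remaining labels), stripping `instance` in transit."""
    out: Dict[Tuple, float] = defaultdict(float)
    for s in samples:
        kept = tuple(sorted((k, v) for k, v in s["labels"].items()
                            if k != "instance"))
        out[(s["metric"], kept)] += s["value"]
    return dict(out)


samples = [
    {"metric": "requests_total",
     "labels": {"service": "api", "instance": "i-1"}, "value": 3.0},
    {"metric": "requests_total",
     "labels": {"service": "api", "instance": "i-2"}, "value": 5.0},
]
agg = aggregate_drop_instance(samples)  # one series remains, instance label gone
```

Cardinality at the backend now scales with services, not instances, which is where the ~10× cost lever comes from.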

Design concerns

  • Where does the filter spec live? Rules/aggregations have to be pushed to the edge. This is a control-plane / data-plane split (concepts/control-plane-data-plane-separation) — the backend or rule engine compiles policies; agents consume them.
  • How much state can the edge carry? eBPF maps (bounded LRU), local memory in a sidecar, local cache in an Agent — the amount of learned / dynamic filter state is capped.
  • Blast radius of a bad filter. A would-match event dropped at the edge is invisible to the backend, so filters require conservative design (patterns/approver-discarder-filter is one shape).
  • Rule rollout lag. Adding a new detection rule requires pushing it to every agent — slower than a central change.
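The blast-radius concern above motivates a fail-open structure. One conservative shape, sketched here with illustrative predicates (this is one possible reading of the approver/discarder pattern, not a reference implementation): explicit approvals forward, explicit discards drop, and anything the edge cannot classify is forwarded rather than silently lost.

```python
from typing import Callable, List

Predicate = Callable[[dict], bool]


def classify(event: dict,
             approvers: List[Predicate],
             discarders: List[Predicate]) -> str:
    """Approvers forward, discarders drop; unknowns fail open to the backend."""
    if any(f(event) for f in approvers):
        return "forward"
    if any(f(event) for f in discarders):
        return "drop"
    return "forward"  # fail open: never silently lose an unclassified event


approvers = [lambda e: e["path"].startswith("/etc/")]
discarders = [lambda e: e["path"].startswith("/tmp/")]
```

Only known-noise is dropped, so a gap in the filter spec costs bandwidth, not visibility.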

Relationship to in-kernel filtering

concepts/in-kernel-filtering is edge filtering taken one layer deeper: the edge itself (the Agent) further pushes the filter into the kernel via eBPF, so even the edge doesn't touch noise events. Edge filtering and in-kernel filtering compose, with pass rates multiplying — the kernel drops ~94% of events, the Agent drops more via richer rules, and only the remaining matches cross the network.
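The composition can be made concrete (both predicates are stand-ins; the numbers are illustrative, matching the ~94% kernel drop rate cited above): a cheap pre-filter models the in-kernel stage, a richer rule models the Agent stage, and the surviving fraction is the product of the two pass rates.

```python
def kernel_prefilter(event: dict) -> bool:
    """Cheap, bounded check standing in for the in-kernel (eBPF) filter."""
    return event["op"] in {"open", "chmod"}


def agent_rules(event: dict) -> bool:
    """Richer userspace rule standing in for full Agent-side evaluation."""
    return event["path"].startswith("/etc/")


# 100 events: the kernel stage passes 6, the Agent stage keeps 1 of those.
events = ([{"path": "/tmp/scratch", "op": "read"} for _ in range(94)]
          + [{"path": "/tmp/scratch", "op": "open"} for _ in range(5)]
          + [{"path": "/etc/passwd", "op": "open"}])
survivors = [e for e in events if kernel_prefilter(e) and agent_rules(e)]
```

Here the kernel drops 94/100 and the Agent drops 5 of the remaining 6, so one event in a hundred crosses the network.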
