PATTERN
Shared kernel resource coordination¶
Shared kernel resource coordination is the pattern of treating certain kernel-object namespaces — TC priorities and handles, cgroup program attach ordering, XDP program slots, LSM hook chains — as an inter-vendor protocol when multiple independent eBPF (or eBPF-like) tools coexist on the same host.
Without explicit coordination, independently correct products can collide on these namespaces and cause real outages — the 2022 Datadog × systems/cilium TC handle-collision incident is the named case study (Source: sources/2026-01-07-datadog-hardening-ebpf-for-runtime-security).
The shape of the collision¶
Two eBPF tools attach programs to the same kernel attachment surface:
- Same program type (e.g. both use `SCHED_CLS` TC classifiers).
- Same hook point (e.g. pod network interfaces in Kubernetes).
- Shared identifier space (e.g. TC priority, TC handle, cgroup program array slot) — contended without explicit protocol.
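The contended namespace can be pictured as a small model (illustrative Python, not any vendor's actual code): the TC filter table on one hook is a map keyed by (priority, handle), so two tools that hard-code the same key are racing for a single slot.

```python
# Illustrative model of the TC filter slot table on one hook (e.g. a pod's
# veth ingress). Keys are (priority, handle); one filter lives at each key.
# Names and semantics are simplified for illustration.

class Hook:
    def __init__(self):
        self.filters = {}  # (priority, handle) -> owning tool

    def attach(self, owner, priority, handle, replace=False):
        key = (priority, handle)
        if key in self.filters and not replace:
            raise RuntimeError(f"{key} already held by {self.filters[key]}")
        self.filters[key] = owner  # replace=True silently evicts the holder

hook = Hook()
hook.attach("vendor-a", priority=1, handle="0:1")
# vendor-b hard-codes the same identifiers and replaces on load:
hook.attach("vendor-b", priority=1, handle="0:1", replace=True)
print(hook.filters[(1, "0:1")])  # -> vendor-b; vendor-a's filter is gone
```

Each tool in isolation behaves correctly; the collision only exists when both claim the same key on the same hook.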
Timing + implicit assumptions about who owns what then decide the outcome:
- If one tool hard-codes priority=1 and handle=`0:1`, and the other loads first at the same priority, the second is silently "above" or "below" the first — which breaks assumptions on either side.
- If any tool reacts to unexpected handle changes (e.g. as a namespace-leak signal) by deleting those resources, it will delete the other vendor's programs.
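The second failure mode — delete on an unexpected handle change — can be sketched the same way (hypothetical watcher logic, not Datadog's actual implementation):

```python
# Hypothetical cleanup watcher: a tool that treats "the filter at my handle
# is no longer mine" as a leak signal and deletes it. Against a co-resident
# vendor, this deletes the *other* tool's live program.

def naive_cleanup(hook_filters, my_key, my_owner):
    """Delete whatever sits at my_key if it is not my own filter."""
    holder = hook_filters.get(my_key)
    if holder is not None and holder != my_owner:
        del hook_filters[my_key]   # removes the peer vendor's filter
        return holder
    return None

filters = {(1, "0:1"): "cni"}      # the CNI's pod-connectivity filter
evicted = naive_cleanup(filters, (1, "0:1"), my_owner="security-agent")
# evicted == "cni": the host's networking filter is now gone
```

The logic is defensible in a single-vendor world (it really does catch namespace leaks); it only becomes destructive once the identifier space is shared.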
The Datadog × Cilium case¶
- Setup. Datadog systems/datadog-workload-protection uses TC classifier programs to inspect pod-network packets. Cilium uses TC programs for pod connectivity, with a hardcoded priority (1) and handle (`0:1`).
- Race. On pod bring-up, Datadog's Agent sometimes loaded its TC filters before Cilium, taking handle `0:1` out from under Cilium.
- Cleanup misfire. When Cilium later loaded and replaced Datadog's filters, Datadog's cleanup logic — designed to prevent network-namespace resource leaks — saw the handle change as a cleanup signal and deleted Cilium's filters.
- Outage. Pods lost connectivity entirely until manual restart.
Mitigations¶
From Datadog's post, generalisable as the pattern:
- Safer defaults for shared namespaces. Datadog raised its TC priority default to 10 so infrastructure (CNI) classifiers run first. Similar defaults should be picked knowing what infrastructure / platform tooling tends to claim.
- Conservative cleanup. Hardened cleanup against races; default to never auto-deleting queuing disciplines and shared kernel resources — the worst case of "leak until manual intervention" is still better than the worst case of "break another vendor".
- Vendor coordination. Document priority conventions and hardcoded handles; talk to peer vendors proactively (Cilium, Isovalent, Falco, etc.). Same shape as xDS-style coordination on a different substrate.
- Co-resident detection + warnings. Inventory who else is using `bpf(2)` on the host; alert when a new eBPF vendor appears or when another process is suspected of disabling or interfering with monitoring.
- Published interface. Ideally, shared kernel-resource namespaces grow a public "here's who uses what, in what order, on what hook" convention — implicit protocols should become explicit.
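Co-resident detection can start as simply as scanning `/proc` for BPF file descriptors. A rough sketch (assumes BPF program/map fds read back as `anon_inode:bpf-prog` / `anon_inode:bpf-map`, which is how they appear under `/proc/<pid>/fd` on Linux; error handling is simplified):

```python
import os

def ebpf_users():
    """Best-effort inventory of processes holding BPF program/map fds.

    BPF fds show up in /proc/<pid>/fd as symlinks reading
    anon_inode:bpf-prog or anon_inode:bpf-map. Reading other
    processes' fd tables generally requires root.
    """
    users = {}
    if not os.path.isdir("/proc"):
        return users
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            fd_dir = f"/proc/{pid}/fd"
            links = [os.readlink(f"{fd_dir}/{fd}") for fd in os.listdir(fd_dir)]
            kinds = sorted({l for l in links if l.startswith("anon_inode:bpf-")})
            if kinds:
                with open(f"/proc/{pid}/comm") as f:
                    users[int(pid)] = (f.read().strip(), kinds)
        except OSError:  # process exited mid-scan, or insufficient privilege
            continue
    return users

# Diffing this inventory over time flags a newly arrived eBPF vendor.
```

An agent that snapshots this periodically can raise the "new eBPF vendor appeared" warning the mitigation calls for, before any handle collision happens.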
Applicability beyond TC¶
The same pattern applies on:
- Cgroup-attached program ordering — documented kernel rules already govern ordering, overrides, and chaining. The operational lesson: read those rules before shipping cgroup programs alongside other vendors.
- XDP program slots — one program per device without special chaining infrastructure.
- LSM hook chains — BPF LSM programs share stacking with other LSMs.
- BPF map pinning in `/sys/fs/bpf/*` — shared pin paths across vendors need a naming protocol.
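For the map-pinning case, an explicit naming protocol can be as simple as vendor-namespaced pin paths. The layout below is a suggestion, not an established convention:

```python
from pathlib import PurePosixPath

BPFFS = PurePosixPath("/sys/fs/bpf")

def pin_path(vendor: str, product: str, obj: str) -> PurePosixPath:
    """Vendor-namespaced pin path: /sys/fs/bpf/<vendor>/<product>/<obj>.

    Keeping each vendor under its own directory means cleanup can be
    scoped to your own subtree and never touches a peer's pinned objects.
    """
    for part in (vendor, product, obj):
        if "/" in part or part in ("", ".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    return BPFFS / vendor / product / obj

print(pin_path("datadog", "workload-protection", "conn_events"))
# -> /sys/fs/bpf/datadog/workload-protection/conn_events
```

The same "own your subtree, never delete outside it" discipline is the pinning analogue of the conservative-cleanup mitigation above.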
When to reach for it¶
Any time your product attaches eBPF programs to hooks that are realistically shared on a customer host with third-party eBPF tooling. In modern Kubernetes that is "always" for anything touching pod networking or cgroup hooks.
Seen in¶
- sources/2026-01-07-datadog-hardening-ebpf-for-runtime-security — Datadog × systems/cilium TC incident + mitigations.
Related¶
- systems/ebpf
- systems/datadog-workload-protection
- systems/cilium
- concepts/ebpf-verifier — orthogonal: within-vendor variability vs. this pattern's cross-vendor variability