
CLOUDFLARE 2026-05-07 Tier 1


Cloudflare — How Cloudflare responded to the Copy Fail Linux vulnerability

Summary

On 2026-04-29 16:00 UTC, CVE-2026-31431 — a Linux kernel local-privilege-escalation vulnerability named "Copy Fail" — was publicly disclosed by Xint Code. The bug is an out-of-bounds 4-byte write in the kernel crypto API's authencesn wrapper (the AEAD wrapper for hmac(sha256),cbc(aes)), reachable from unprivileged user-space via the AF_ALG socket family + the algif_aead module. An attacker opens an AF_ALG socket, uses splice() to chain the target file's page-cache pages (canonically /usr/bin/su) into the crypto operation's scatterlist, and on recvmsg() the out-of-bounds write taints the cached pages of the setuid-root binary. Running su then executes attacker-controlled 4-byte chunks with root privileges. The upstream fix (commit a664bf3d603d) reverts a 2017 in-place-crypto optimisation; it was backported to LTS 6.12 after the public disclosure, not before.

Cloudflare discloses its 5-day incident-response timeline from first assessment on 2026-04-29 through full-fleet mitigation by 2026-05-04. There was no customer impact, no data at risk, no service disruption — the post's thesis is that existing discipline (behavioral detection, an always-on bpf-lsm framework, the Edge Reboot Release (ERR) pipeline, fleet-wide centralised logging) absorbed a CVE whose fix had not yet reached Cloudflare's primary kernel line. Five workstreams ran in parallel — blast-radius mapping, detection validation, proactive threat hunting 48 hours back in time, runtime mitigation engineering, and scheduled kernel patching. The canonical architectural moves are: (1) behavioral detection caught the exploit pattern within minutes of internal validation, with no signature update, no rule change, no human intervention (concepts/behavioral-detection-vulnerability-agnostic); (2) a bpf-lsm eBPF program that denies the socket_bind LSM hook for AF_ALG for everyone except an explicit allow-list of known-legitimate binaries (patterns/bpf-lsm-allowlist-hook-denial); (3) a deliberately two-phase rollout: deploy ebpf_exporter fleet-wide first to measure which binaries legitimately use AF_ALG, confirm the single expected internal service is the only legitimate user, then push the bpf-lsm enforcement program behind a separate gate (patterns/visibility-before-enforcement-rollout); (4) assume-compromise threat hunting covering 48 hours of fleet-wide logs before the vulnerability was publicly known (concepts/assume-compromise-posture + patterns/fleet-wide-retroactive-threat-hunt); (5) the first mitigation attempt — unconditional algif_aead module removal — broke in staging due to a dependency conflict and was rolled back without customer impact (patterns/staging-caught-mitigation-failure). 
The post explicitly acknowledges a gap: despite a biweekly patching cadence, Cloudflare remained vulnerable because the upstream fix had not yet reached the LTS 6.12 line that the majority of the fleet runs (concepts/lts-kernel-backport-latency-gap).

Key takeaways

  1. Behavioral detection is vulnerability-agnostic. When engineers validated the exploit internally on 2026-04-29 22:52 UTC, existing endpoint detection flagged the activity within minutes: "without a signature update, without a rule change, and without human intervention" (verbatim). It linked the full execution chain from script interpreter → kernel crypto subsystem → setuid binary. Canonical instance of concepts/behavioral-detection-vulnerability-agnostic.

  2. bpf-lsm is the fleet's runtime-mitigation primitive. Cloudflare had already built and deployed a generic bpf-lsm framework for exactly this scenario. For Copy Fail, a bpf-lsm program denies the socket_bind LSM hook for the AF_ALG address family whenever the calling binary is not on an explicit allow-list of legitimate AF_ALG users. The algif_aead module stays loaded for legitimate traffic; exploit attempts see PermissionError: [Errno 1] Operation not permitted.

  3. Visibility before enforcement — two separate gates. Rolling out enforcement without first measuring legitimate usage would risk breaking internal services that rely on AF_ALG. Cloudflare used prometheus-ebpf-exporter to hook the socket() syscall and track per-binary AF_ALG usage across the fleet — no kernel changes, data from hundreds of thousands of servers within hours. Only after confirming the one known internal service was the sole legitimate user was the bpf-lsm enforcement program pushed behind a separate gate. Canonical instance of patterns/visibility-before-enforcement-rollout.

  4. Assume compromise until proven otherwise. Security's standard posture for critical vulnerabilities is to assume exploitation could have occurred before public disclosure and to work systematically to either confirm or rule it out. For Copy Fail: 48 hours of kernel logs across the full fleet were searched for the exploit's distinctive signature; access logs were reconstructed for affected systems; cryptographic hashes of system binaries were validated against known-good package manifests; network connections and persistence mechanisms were audited. "Everything was clean." Canonical wiki instance of concepts/assume-compromise-posture + patterns/fleet-wide-retroactive-threat-hunt.

  5. The LTS backport latency gap is the structural hazard even with a biweekly patching cadence. Verbatim quote: "despite our practice of deploying Linux patch updates every two weeks, we remained vulnerable because a month-old mainline fix had yet to be backported to our primary kernel line." Canonical first-class articulation of concepts/lts-kernel-backport-latency-gap — the window between when a mainline Linux fix lands and when it reaches the LTS series an operator runs is not covered by "we patch often." Runtime mitigations (patterns/bpf-lsm-allowlist-hook-denial) cover the gap.

  6. Edge Reboot Release (ERR) ships patched kernels on a 4-week reboot cycle; control-plane adopts the latest. Cloudflare runs a custom Linux kernel built from community LTS releases (6.12 majority + 6.18 early adopters at disclosure time). An automated job generates a new internal kernel build weekly from upstream LTS pulls; builds flow through staging → production via the ERR pipeline on a 4-week edge-reboot cadence; control-plane runs the newest kernel with workload-specific reboot schedules.

  7. Five parallel workstreams on day one, no customer impact at any point. On disclosure (2026-04-29 16:00 UTC) Cloudflare ran in parallel: blast-radius mapping (which kernel versions are vulnerable), detection coverage validation, proactive threat hunting (48 h lookback), runtime mitigation engineering (bpf-lsm), and scheduled kernel patching. By 2026-04-30 evening the bpf-lsm mitigation was fleet-wide behind an enforcement gate; by 2026-05-04 morning the patched LTS 6.12 kernel was rolling via reboot automation at normal pace.

  8. First mitigation attempt broke in staging — safely. The initial plan was the researchers' recommended fix: echo "install algif_aead /bin/false" > /etc/modprobe.d/... then rmmod algif_aead. On 2026-04-29 evening the first push to the staging datacenter surfaced a dependency conflict (software legitimately using the kernel crypto API broke) and was rolled back. No production impact. Canonical instance of patterns/staging-caught-mitigation-failure — the staging layer is the fault-domain boundary for "attempt a mitigation you haven't fully characterised yet."

  9. Two-step bpf-lsm rollout beats a one-step module unload — visibility first, then enforcement. "Before enabling enforcement, we verified that our known internal service was the sole legitimate AF_ALG user to avoid accidental outages." The ebpf_exporter hook on the socket() syscall created the measurement loop (aggregate per-binary AF_ALG usage across the fleet, no kernel changes). Results confirmed the identified service was the only legitimate user — then the bpf-lsm program shipped behind a separate enforcement gate.

  10. Acknowledged follow-up work, named explicitly. "Better visibility into kernel-API dependencies" — review kernel-subsystem usage across production services so future mitigations don't hit surprise dependencies; "better runtime mitigation" — faster bpf-lsm deployments, better playbooks, better logging/visibility; "reduce attack surface of Linux kernel" — proactively identify unused modules/features and remove from the build entirely. The third item is a canonical instance of attack-surface reduction at build time.

Incident timeline (all UTC)

Time (UTC) Event
2026-04-29 16:00 Copy Fail publicly disclosed (Xint Code).
2026-04-29 ~21:00 Security + Engineering begin assessment of fleet exposure + mitigation options.
2026-04-29 22:52 Security confirms existing behavioral detection covers the exploit pattern. Detection flags internal validation activity within minutes.
2026-04-29 23:01 Existing behavioral detection generates a high-severity alert for exploit-like activity.
2026-04-29 evening First mitigation push to staging datacenter (unconditional algif_aead removal). Dependency conflict surfaces; rollback. No production impact.
2026-04-29 overnight Engineering drafts bpf-lsm mitigation program.
2026-04-30 03:14 Security incident declared. Fleet-wide threat hunting of historical data begins.
2026-04-30 morning Engineering tests the bpf-lsm program; makes it production-ready.
2026-04-30 14:25 Engineering incident declared to coordinate mitigation + Linux patch rollout.
2026-04-30 ~17:00 Decision: ship a patched build of the previous LTS line through reboot automation; do not accelerate the new LTS; lean on bpf-lsm in the meantime.
2026-04-30 afternoon Visibility pipeline (eBPF tracing of AF_ALG socket usage) deployed fleet-wide. Complete picture of legitimate AF_ALG users.
2026-04-30 evening bpf-lsm mitigation rolled out behind a separate enforcement gate, fully mitigating the fleet. End-to-end verification on a previously-vulnerable test node confirms the exploit no longer works.
2026-05-04 morning Reboot automation resumes at normal pace with the patched kernel.
2026-05-04 onward Servers that had already passed through reboot automation earlier in the week manually rebooted to pick up the patched kernel. Unpatched servers update per normal reboot automation.

Architectural themes

Behavioral detection as vulnerability-agnostic coverage

The load-bearing design claim is that behavioral detection — monitoring process execution patterns for anomalies, not matching specific CVE signatures — produces coverage that exists before any vulnerability-tailored rule is written. Cloudflare's endpoints detected the full execution chain (script interpreter → kernel crypto subsystem → privilege-escalation binary) as malicious based on fleet-wide behavioral patterns, not on knowledge of CVE-2026-31431. The confirmation that detection coverage existed before any Copy-Fail-specific logic was written is the post's first high-confidence signal that the fleet was not compromised pre-disclosure.

bpf-lsm as the runtime-mitigation primitive

The bpf-lsm framework is Cloudflare's generic mechanism for live-patching security vulnerabilities without reboots. For Copy Fail the specific program hooks socket_bind:

  1. If socket_family != AF_ALG → allow (cheap, ~all traffic).
  2. If socket_family == AF_ALG → check calling binary's path against the allow-list.
  3. If binary is on the allow-list → allow the bind.
  4. Otherwise → deny with EPERM.
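
The decision flow above is simple enough to model. The real mitigation is an eBPF program attached to the LSM socket_bind hook, and its source and allow-list are not published; the following Python sketch mirrors only the allow/deny logic, with a hypothetical allow-list entry.

```python
# Model of the socket_bind decision flow described above. The real program
# is eBPF C on the LSM socket_bind hook; this only mirrors its logic.
AF_ALG = 38   # Linux address family for the kernel crypto API
AF_INET = 2
EPERM = 1     # errno seen by denied callers ("Operation not permitted")

# Hypothetical allow-list; Cloudflare's actual list is not disclosed.
AF_ALG_ALLOWLIST = {"/usr/local/bin/crypto-service"}

def socket_bind_hook(socket_family: int, binary_path: str) -> int:
    """Return 0 to allow the bind, or -EPERM to deny it."""
    if socket_family != AF_ALG:
        return 0                   # cheap fast path: ~all traffic
    if binary_path in AF_ALG_ALLOWLIST:
        return 0                   # known-legitimate AF_ALG user
    return -EPERM                  # everyone else is denied

print(socket_bind_hook(AF_INET, "/usr/bin/curl"))                     # 0
print(socket_bind_hook(AF_ALG, "/usr/local/bin/crypto-service"))      # 0
print(socket_bind_hook(AF_ALG, "/usr/bin/python3"))                   # -1
```

An exploit attempt running under an arbitrary binary hits the last branch, which is why the unprivileged one-liner fails with Operation not permitted while algif_aead stays loaded for the allow-listed service.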

This pattern is reusable across any kernel CVE that can be gated at an LSM hook — the shape is allowlist-denial at an LSM hook. Contrast with whole-module removal, which is coarser and risks dependency breakage (as the 2026-04-29 staging attempt demonstrated).

Two-phase staged rollout: visibility first, then enforcement

The patterns/visibility-before-enforcement-rollout pattern separates measurement and enforcement into two independent deployment gates:

  • Phase 1 — visibility. Push an ebpf_exporter config (via salt, no kernel changes) that hooks the socket() syscall and emits a Prometheus metric of AF_ALG socket creation per binary. Aggregate across hundreds of thousands of servers within hours. Confirm that the one expected legitimate user is actually the only legitimate user.

  • Phase 2 — enforcement. Push the bpf-lsm program behind a separate gate. Enforcement takes effect only after phase 1 has validated the allow-list.

Each phase can be rolled back independently. Phase 1 has zero failure mode beyond metrics overhead. Phase 2's failure mode is bounded by the allow-list validated in phase 1.
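
The phase-1 gate reduces to a set comparison once per-binary counters are aggregated. A minimal sketch of that check, assuming each server reports a per-binary count of AF_ALG socket creations (as the ebpf_exporter metric would expose via Prometheus; binary names here are illustrative):

```python
from collections import Counter

def aggregate_af_alg_usage(per_server_counts):
    """Sum per-binary AF_ALG socket-creation counts across the fleet."""
    fleet = Counter()
    for counts in per_server_counts:
        fleet.update(counts)
    return fleet

def allowlist_is_safe(fleet_counts, expected_users):
    """Phase 2 may ship only if every observed AF_ALG user is expected."""
    return set(fleet_counts) <= set(expected_users)

servers = [
    {"/usr/local/bin/crypto-service": 120},
    {"/usr/local/bin/crypto-service": 87},
]
fleet = aggregate_af_alg_usage(servers)
print(allowlist_is_safe(fleet, {"/usr/local/bin/crypto-service"}))  # True
```

Any unexpected binary in the fleet aggregate flips the check to False and blocks the enforcement gate, which is exactly the outage the two-phase design avoids.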

Assume-compromise threat hunting

On every critical vulnerability, Cloudflare's security team starts from "assume compromise until you can prove otherwise". The retrospective search covers:

  1. Kernel logs for the exploit's distinctive signature, 48 hours back in time across the full fleet.
  2. Access logs for affected systems — who connected, when, what commands ran — giving a complete forensic picture of interactive activity.
  3. Binary integrity — cryptographic hashes of system binaries validated against known-good package manifests (the exploit taints the page cache, so on-disk hashes remain clean; the check is complementary to behavioral detection).
  4. Persistence mechanisms — audit for common post-exploitation persistence.
  5. Network connections — audit for unusual egress or lateral movement.
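
The binary-integrity pillar (item 3) is mechanically a hash-versus-manifest comparison. A minimal sketch, assuming a known-good manifest mapping paths to SHA-256 digests (the manifest format is illustrative; the post does not specify tooling):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 without loading it whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_binaries(manifest):
    """Return paths whose on-disk hash differs from the known-good manifest."""
    return [path for path, good in manifest.items() if sha256_of(path) != good]
```

Note the limitation the post itself flags: Copy Fail taints page-cache pages, not the on-disk file, so on-disk hashes stay clean; this check rules out on-disk tampering and is complementary to behavioral detection, not a substitute for it.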

Load-bearing framing: all five pillars must come back clean before "no compromise" is asserted. The concepts/assume-compromise-posture page canonicalises this as a standing posture, not an exceptional one.

LTS backport latency gap

The post's honest self-assessment: "despite our practice of deploying Linux patch updates every two weeks, we remained vulnerable because a month-old mainline fix had yet to be backported to our primary kernel line." The upstream Linux fix for Copy Fail landed in mainline weeks before public disclosure, but had not yet been backported to the LTS 6.12 line that Cloudflare's majority fleet runs. The concepts/lts-kernel-backport-latency-gap page captures this as a structural property of the LTS-based fleet model: frequent upstream pulls don't shorten the mainline→LTS backport window. Runtime mitigations (patterns/bpf-lsm-allowlist-hook-denial) and attack-surface reduction (stated follow-up: "remove unused modules from the build entirely") are the complementary levers.

Edge Reboot Release (ERR) pipeline

The scheduled patching substrate: community LTS pulls → weekly internal kernel build → staging-datacenter validation → global rollout via ERR on a 4-week edge-reboot cycle. The control plane runs the newest available kernel with workload-aware reboot scheduling. For Copy Fail the patched 6.12 build resumed the normal ERR cadence on 2026-05-04; out-of-cycle reboots picked up the patched kernel for servers that had already passed through ERR earlier in the week.

Named systems (wiki instances)

  • Edge Reboot Release (ERR) — scheduled kernel-patch rollout pipeline (4-week edge-reboot cycle).
  • bpf-lsm framework — generic runtime-mitigation mechanism for live-patching without reboots.
  • ebpf_exporter — fleet-wide eBPF-to-Prometheus visibility (salt-gated config push).

Named concepts

  • concepts/behavioral-detection-vulnerability-agnostic
  • concepts/assume-compromise-posture
  • concepts/lts-kernel-backport-latency-gap

Named patterns

  • patterns/bpf-lsm-allowlist-hook-denial
  • patterns/visibility-before-enforcement-rollout
  • patterns/fleet-wide-retroactive-threat-hunt
  • patterns/staging-caught-mitigation-failure

Operational numbers

  • 5 days disclosure → fleet-fully-patched-or-mitigated (2026-04-29 → 2026-05-04).
  • Minutes to detection on internal validation (behavioral detection, no signature update).
  • 48-hour retroactive threat-hunt window across fleet-wide logs.
  • 4-week Edge Reboot Release cycle for normal kernel patch rollout.
  • Biweekly Linux patch cadence — didn't close the CVE-2026-31431 gap because LTS backport lagged mainline.
  • Hours to aggregate fleet-wide AF_ALG usage via ebpf_exporter (no kernel changes, salt-gated config push).
  • LTS versions in production at disclosure: majority on 6.12, subset on 6.18.

Caveats

  • Vendor-authored post; security-team narrative. No independent verification of the "no customer impact" claim. Post doesn't say whether any internal service used the unintended algif_aead path — only that the one known legitimate user was confirmed to be the sole user.
  • Exact bpf-lsm eBPF program source not published; the Copy Fail researchers' one-liner (python3 -c ...) is reproduced but Cloudflare's specific allow-list is not.
  • No fleet-size numbers disclosed in this post ("hundreds of thousands of servers" is generic).
  • Behavioral detection vendor / engine not named.
  • Canonical /usr/bin/su attack target disclosed publicly by Xint Code; Cloudflare's own fleet may have different setuid-root binaries that change the exploitation surface per-host.
  • "Better visibility into kernel-API dependencies" is stated as a follow-up, implying it was not fully in place when the first mitigation attempt (unconditional module removal) was designed — which is exactly what caught the dependency conflict in staging.
  • Post does not disclose how Cloudflare's custom LTS kernel build relates to the public LTS release schedule beyond "weekly internal build" / "4-week ERR".
