
CLOUDFLARE 2025-06-20 Tier 1


Defending the Internet: how Cloudflare blocked a monumental 7.3 Tbps DDoS attack

Summary

Cloudflare recounts autonomously blocking a 7.3 Tbps / 4.8 Bpps DDoS attack, the largest ever reported (12% larger than Cloudflare's prior record and roughly 1 Tbps above the recent 6.3 Tbps KrebsOnSecurity attack), against a hosting-provider customer using Magic Transit. The attack delivered 37.4 TB in 45 seconds and was 99.996% UDP flood plus a long tail of classic reflection/amplification vectors (QOTD, Echo, NTP, Portmap, RIPv1) and Mirai UDP-flood traffic. It was detected and mitigated fully autonomously in 477 data centers across 293 locations, without human intervention or alerts: every Cloudflare server runs every service, so DDoS mitigation happens wherever the traffic lands. The detection/mitigation pipeline: global anycast routes the flood to the closest POP; XDP/eBPF samples packets from the Linux kernel hot path; a heuristic engine called dosd (denial-of-service daemon) analyses samples in user space, generates candidate fingerprint permutations, counts sample hits per fingerprint with a data-streaming algorithm, and compiles the winning fingerprint to an eBPF program that drops packets in XDP once thresholds trip. Each server's top fingerprints are gossiped (multicast) within the data center and globally, so peers get the intelligence for free. The post also doubles as a don't-become-a-reflector guide for seven UDP vectors and a plug for Cloudflare's free DDoS Botnet Threat Feed (600+ orgs enrolled).

Key takeaways

  1. Anycast turns the attack's distributed nature against it. The victim IP was advertised via global anycast, so the ~7.3 Tbps flood from 122,145 source IPs in 5,433 ASes across 161 countries was routed to the closest Cloudflare POP for each source packet rather than converging on one scrubbing centre — "detected and mitigated in 477 data centers across 293 locations." The attacker's geographic spread becomes Cloudflare's per-POP-capacity advantage. (Canonical instance of concepts/anycast as a volumetric-DDoS-defence primitive.)
  2. Every Cloudflare server runs every service. DDoS detection and mitigation are not a central scrubbing tier — they are code that ships to every edge server on the fleet. "This means that attacks can be detected and mitigated fully autonomously, regardless of where they originate from." The fully autonomous response — no human in the loop, no alert fired, no incident declared — is a direct consequence of this architecture. (Canonical instance of patterns/autonomous-distributed-mitigation.)
  3. XDP + eBPF is the hot path, dosd is the control plane. On packet arrival, Cloudflare samples packets inside the Linux kernel from eXpress Data Path (XDP) using an eBPF program, routes samples to user space where dosd (denial-of-service daemon) looks for packet-header commonalities / anomalies / proprietary patterns, generates multiple fingerprint permutations, and uses a data-streaming algorithm to bubble up the permutation with the most hits. The mitigation rule is then compiled back down as an eBPF program to drop matching packets in XDP at line rate. (Second wiki instance of eBPF as a production data-plane, alongside Datadog Workload Protection; first wiki instance of XDP for DDoS.)
  4. Fingerprint permutation search is the accuracy lever. The naive fingerprint (most common header value) risks false positives; dosd instead enumerates permutations of candidate fingerprints and counts sample hits per permutation, so the winner is the most selective match ("try and surgically match against attack traffic without impacting legitimate traffic"). Activation thresholds gate compilation to avoid triggering on benign bursts; the rule auto-expires when the attack ends.
  5. Gossip multicast spreads the fingerprint. Each server's top fingerprint permutations are gossiped (multicast) within the data centre and globally, so peer servers get the mitigation state without re-deriving it from their own (possibly smaller) local sample. (Canonical instance of patterns/gossip-fingerprint-propagation — P2P dissemination of threat intelligence across an edge fleet; sibling of the control-plane-pushes-rules shape that would otherwise centralise this at the cost of a global SPOF.)
  6. 99.996% UDP flood + a long tail of reflection/amplification vectors. The 7.3 Tbps was almost all UDP flood; the remaining 0.004% (~1.3 GB) was split across QOTD (UDP/17), Echo (UDP/TCP/7), NTP monlist (UDP/123), Mirai UDP flood, Portmap (UDP/111), and RIPv1 (UDP/520), the reflection vectors each exploiting a decades-old diagnostic or routing protocol whose amplification factor is known. The flood carpet-bombed an average of 21,925 destination ports per second on a single IP (peaking at 34,517/sec). The defender's counter is the same for every vector (deploy cloud-based volumetric protection, then rate-limit or drop the specific UDP port), but the operator-side don't-become-a-reflector advice varies per vector and lands in concepts/udp-reflection-amplification.
  7. Scale shape. 37.4 TB in 45 seconds ≈ 9,350 HD movies / 7,480 hours of HD video / 12.5 M smartphone photos / 9.35 M songs — in 45 seconds. Traffic originated from 122,145 source IPs across 5,433 Autonomous Systems in 161 countries; top source ASes were Telefonica Brazil (10.5%), Viettel Vietnam (9.8%), China Unicom (3.9%), Chunghwa Telecom (2.9%), China Telecom (2.8%), Claro NXT / VNPT / UFINET Panama / STC / FPT Telecom each 1.3-1.8%. Avg unique source IPs/sec 26,855, peak 45,097. Attack targeted a Magic Transit hosting-provider customer — hosting and Internet infrastructure are increasingly targeted per Cloudflare's Q1 2025 DDoS threat report.
  8. Botnet threat-feed distribution. Cloudflare's free DDoS Botnet Threat Feed for Service Providers gives hosting / cloud / ISP operators a per-ASN list of source IPs they see launching HTTP DDoS — 600+ organizations enrolled, free, ASN auth via PeeringDB. The distribution posture mirrors the detection posture: push the intelligence out to the actors who can act on it (the origin networks), rather than centralising mitigation in the victim layer.
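The post doesn't publish dosd's internals, but the permutation-search idea described above can be sketched as a toy: model "permutations" as subsets of header fields, count sample hits per candidate, and prefer the most specific fingerprint that still covers the attack. Field names, the coverage threshold, and the synthetic traffic below are all invented for illustration.

```python
from itertools import combinations
from collections import Counter

# Candidate header fields a fingerprint may constrain (names assumed;
# dosd's real feature set is not public).
FIELDS = ("proto", "dst_port", "src_port", "pkt_len", "ttl")

def fingerprint_search(samples, coverage=0.8):
    """Enumerate candidate fingerprints from most to least specific and
    return the first (i.e. most selective) one whose single most common
    value-tuple still covers >= `coverage` of the samples."""
    for n in range(len(FIELDS), 0, -1):
        for fields in combinations(FIELDS, n):
            hits = Counter(tuple(pkt[f] for f in fields) for pkt in samples)
            values, count = hits.most_common(1)[0]
            if count / len(samples) >= coverage:
                return dict(zip(fields, values))  # compile-ready match rule
    return None

# Synthetic samples: a UDP flood toward port 443 plus background noise.
flood = [{"proto": 17, "dst_port": 443, "src_port": 40000 + i % 7,
          "pkt_len": 1400, "ttl": 60 + i % 3} for i in range(90)]
noise = [{"proto": 6, "dst_port": 80, "src_port": 50000 + i,
          "pkt_len": 120, "ttl": 64} for i in range(10)]

rule = fingerprint_search(flood + noise)
print(rule)  # {'proto': 17, 'dst_port': 443, 'pkt_len': 1400}
```

Note how the five-field and four-field candidates fail the coverage test because the flood itself varies in source port and TTL; the winner matches all attack samples and none of the noise, which is the "surgical" property the post describes.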

Architecture / numbers

Attack traffic shape

  • Peak bandwidth: 7.3 Tbps (12% larger than the prior Cloudflare record; ~1 Tbps above Krebs' 6.3 Tbps)
  • Peak packet rate: 4.8 Bpps (the prior Q1 2025 record)
  • Volume delivered: 37.4 TB
  • Duration: 45 seconds
  • Destination ports: 21,925 avg / 34,517 peak per second, carpet-bombed on a single customer IP
  • Source ports: similar distribution
  • Vector breakdown: 99.996% UDP flood; the remaining 0.004% (~1.3 GB) split across QOTD / Echo / NTP monlist / Mirai UDP / Portmap / RIPv1
  • Source IPs: 122,145 unique total; 26,855 avg/sec; 45,097 peak/sec
  • Source ASes: 5,433; Telefonica BR 10.5%, Viettel VN 9.8%, China Unicom 3.9%, Chunghwa Telecom 2.9%, China Telecom 2.8%; remaining top-10 ASNs 1.3-1.8% each
  • Source countries: 161; Brazil + Vietnam together ≈50%; Taiwan / China / Indonesia / Ukraine / Ecuador / Thailand / US / Saudi Arabia together ≈33%
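A quick consistency check of the headline figures, using only numbers quoted in this section. Peak bandwidth and peak packet rate did not necessarily coincide in time, so the bytes-per-packet figure is indicative only.

```python
volume_tb = 37.4    # terabytes delivered
duration_s = 45     # seconds
peak_tbps = 7.3     # peak bandwidth
peak_pps = 4.8e9    # peak packet rate

# Average bandwidth over the whole attack (TB -> terabits).
avg_tbps = volume_tb * 8 / duration_s
print(f"average: {avg_tbps:.2f} Tbps vs peak {peak_tbps} Tbps")  # average: 6.65 Tbps vs peak 7.3 Tbps

# Implied packet size if both peaks coincided: 7.3 Tbps / 4.8 Bpps.
bytes_per_pkt = peak_tbps * 1e12 / 8 / peak_pps
print(f"~{bytes_per_pkt:.0f} bytes/packet at peak")  # ~190 bytes/packet at peak
```

The ~6.65 Tbps average against a 7.3 Tbps peak says the attack ran near peak intensity for essentially its whole 45 seconds, and the small implied packet size is consistent with a packet-rate-heavy UDP flood.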

Defence topology

  • Customer product: Magic Transit (IP-network-level DDoS protection; anycast-advertised customer prefixes; L3 scrubbing)
  • Anycast POPs involved: 477 data centers across 293 locations (high-traffic locations have multiple data centers)
  • Detection/mitigation location: on-server, on every server; no central scrubbing tier
  • Human intervention: zero (no alerts fired, no incident opened)
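The anycast claim can be illustrated with a toy model: routing deterministically maps each source to one "closest" POP, so a distributed flood divides across the fleet instead of converging on one site. The POP count, source labels, and hash-based "routing" below are stand-ins, not Cloudflare's topology.

```python
import hashlib

# Toy fleet; the real deployment is 477 data centers in 293 locations.
POPS = [f"pop{i:02d}" for i in range(20)]

def nearest_pop(source: str) -> str:
    # Stand-in for BGP anycast: each source is deterministically
    # routed to exactly one "closest" POP.
    digest = int(hashlib.md5(source.encode()).hexdigest(), 16)
    return POPS[digest % len(POPS)]

# Toy botnet: 1,000 sources (the real attack used 122,145 source IPs).
sources = [f"src{i}" for i in range(1000)]
load = {}
for src in sources:
    pop = nearest_pop(src)
    load[pop] = load.get(pop, 0) + 1

print(f"{len(load)} POPs engaged; busiest POP saw {max(load.values())}/{len(sources)} sources")
```

The more geographically spread the botnet, the closer each POP's share sits to the fair share, which is why the post frames the attacker's distribution as the defender's advantage.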

Software pipeline (per server)

  • Packet sampling: an XDP program (eBPF) pulls samples from the kernel hot path
  • Analysis: dosd (denial-of-service daemon) in user space, a heuristic engine that detects packet-header commonalities, anomalies, and proprietary patterns
  • Fingerprint search: generate multiple permutations of candidate fingerprints, count sample hits per permutation using a data-streaming algorithm, pick the best (highest mitigation efficacy and accuracy)
  • Activation: a per-fingerprint hit threshold must be exceeded (false-positive guard); once tripped, the fingerprint is compiled to an eBPF program that drops matching packets in XDP
  • Sharing: each server gossips (multicasts) its top fingerprints within the data centre and globally
  • Auto-expiry: the rule times out and is removed once the attack ends
  • Customer-facing abstraction: the fingerprinting system is exposed as DDoS Protection Managed Rulesets
  • Load-balancing substrate: Unimog (Cloudflare's edge L4 load balancer) routes each arriving packet to an available server
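The gossip step is named but not specified. A minimal in-memory sketch of the behaviour described, where one server crosses its activation threshold and peers adopt the compiled rule without re-deriving it from their own (possibly smaller) sample, might look like this; all names and thresholds are invented:

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.local_hits = {}       # fingerprint -> sample hits seen locally
        self.active_rules = set()  # fingerprints compiled to drop rules

    def observe(self, fingerprint, hits, threshold=100):
        # Activation threshold guards against compiling on benign bursts.
        self.local_hits[fingerprint] = self.local_hits.get(fingerprint, 0) + hits
        if self.local_hits[fingerprint] >= threshold:
            self.active_rules.add(fingerprint)

    def gossip_to(self, peers):
        # "Multicast" top fingerprints: peers adopt the rule directly,
        # without needing enough local samples of their own.
        for peer in peers:
            peer.active_rules |= self.active_rules

dc = [Server(f"srv{i}") for i in range(4)]
fp = ("udp", 443, 1400)
# Only srv0 sees enough attack samples to cross its threshold...
dc[0].observe(fp, hits=150)
# ...but after one gossip round every peer drops the same traffic.
dc[0].gossip_to(dc[1:])
print([fp in s.active_rules for s in dc])  # [True, True, True, True]
```

This is the design choice the post contrasts implicitly: peer-to-peer dissemination means there is no central rule-pushing control plane to become a single point of failure.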

Reflection/amplification vector table (attacker → defender)

| Vector | Port | Type | Attacker abuses | Operator fix (don't reflect) | Defender fix |
| --- | --- | --- | --- | --- | --- |
| QOTD | UDP/17 | Reflection + amplification | short-quote response | disable the service; block UDP/17 | drop UDP/17 inbound |
| Echo | UDP+TCP/7 | Reflection + amplification | echoes back any data sent | disable the service; block port 7 | drop port 7 inbound |
| NTP monlist | UDP/123 | Reflection + amplification | monlist returns recent-connection list | disable monlist; restrict NTP to trusted hosts | rate-limit UDP/123 |
| Mirai | UDP | Flood (botnet) | hijacked IoT devices | secure IoT: change default credentials, keep firmware updated | cloud volumetric protection + rate-limit UDP |
| Portmap | UDP/111 | Reflection + amplification | RPC service enumeration | disable Portmapper if not needed | ACLs on UDP/111; block inbound |
| RIPv1 | UDP/520 | Reflection + (low) amplification | unauthenticated routing updates | move to RIPv2 with authentication, or disable RIPv1 | block UDP/520 from untrusted sources |
| UDP flood | any | Flood | saturates link / appliance | N/A | cloud volumetric protection + smart rate-limiting |

Caveats / gaps

  • No POP-level capacity breakdown. The post says 477 data centres absorbed the attack, but doesn't disclose per-POP traffic share — the "anycast spreads it" claim is not quantified with traffic percentages by POP.
  • No dosd latency numbers. The sample → permutation search → activation → eBPF compile → drop loop's end-to-end latency (how many milliseconds from attack onset until packets start dropping in XDP) is not stated. Given the 45-second attack duration and the "no incidents caused" claim, the loop clearly closes quickly, but the number is not published.
  • Sample rate unspecified. XDP sampling ratio (1-in-N packets) is not disclosed. At 7.3 Tbps / 4.8 Bpps, even a low sample rate generates substantial per-server sample volume; the engineering choice of rate vs accuracy vs CPU is not discussed.
  • Gossip protocol details unspecified. "Gossips (multicasts) top fingerprint permutations" is the operative phrase; the post doesn't specify the underlying mechanism (IP multicast? pubsub fabric? custom gossip?), the cadence, or the convergence-time / blast-radius-of-a-bad-fingerprint trade-off. Treated as a named but not-internally-documented edge of the system.
  • Data-streaming-algorithm family unnamed. "Using a data streaming algorithm, we bubble up the fingerprint with the most hits" — likely Count-Min Sketch / Space-Saving / similar heavy-hitters algorithm, but not stated in post.
  • dosd is named but not externally documented. The post discloses the name (denial-of-service daemon), but there is no open-source dosd or public design document to cite; treated as an internal Cloudflare system.
  • Legitimate-traffic impact during mitigation is claimed, not measured. "Without causing any incidents" — no customer-side error-rate / latency deltas for the victim hosting-provider during the 45-second window are shared.
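On the unnamed data-streaming algorithm: Space-Saving is one plausible (unconfirmed) candidate from the heavy-hitters family named above. It tracks approximate top items in bounded memory, which is exactly the shape needed to "bubble up" the dominant fingerprint from a packet-sample stream. A minimal sketch:

```python
def space_saving(stream, k):
    """Space-Saving heavy hitters: keep at most k counters; when a new
    item arrives and the table is full, evict the minimum counter and
    inherit its count (counts may overestimate, never underestimate)."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return counters

# A skewed stream: one dominant fingerprint plus scattered noise,
# roughly the shape a volumetric attack produces in the sample feed.
stream = ["fp_attack"] * 900 + [f"fp_noise_{i}" for i in range(100)]
top = space_saving(stream, k=8)
print(max(top, key=top.get))  # fp_attack
```

Count-Min Sketch would serve equally well; either way the point is constant memory per server regardless of how many distinct fingerprints the sample stream contains.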
