CONCEPT Cited by 6 sources

Anycast

Anycast is a network-layer routing discipline in which the same IP address (or prefix) is advertised from multiple geographically dispersed points of presence (POPs) via BGP, and the global routing fabric naturally delivers each source's packets to the topologically nearest advertising POP. From the sender's perspective there is one destination IP; from the fleet operator's perspective there are N servers / N POPs, and which one receives a given packet is a function of BGP best-path selection along the packet's path through the Internet.
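Concretely, "advertised from multiple POPs" means each site originates the identical prefix from its own router. A minimal sketch in FRRouting config syntax (AS numbers, neighbor addresses, and the 198.51.100.0/24 prefix are made up for illustration):

```
! POP A (e.g. Frankfurt) -- frr.conf
router bgp 64500
 neighbor 203.0.113.1 remote-as 64501    ! upstream transit A
 address-family ipv4 unicast
  network 198.51.100.0/24                ! the anycast prefix
 exit-address-family

! POP B (e.g. Singapore) -- identical origination, different upstream
router bgp 64500
 neighbor 203.0.113.9 remote-as 64502    ! upstream transit B
 address-family ipv4 unicast
  network 198.51.100.0/24                ! same prefix, second origin
 exit-address-family
```

Each upstream sees a legitimate route to the same /24; BGP best-path selection at every intermediate AS decides which origin a given sender's packets reach.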

Anycast is the load-bearing primitive of most modern global CDN / edge / DNS / DDoS services:

  • Anycast DNS: resolvers like 1.1.1.1, 8.8.8.8, and most authoritative roots run anycast so queries land at the nearest replica.
  • Anycast CDN: Cloudflare, Google's edge, Fastly advertise the same customer IP from every POP; HTTP requests land close to the client.
  • Anycast DDoS scrubbing: a victim prefix advertised anycast distributes an incoming flood across all POPs instead of concentrating it on one scrubbing site.

Why it wins for DDoS defence

This is the canonical instance discussed on the wiki.

In a flood attack, traffic comes from many sources (botnet nodes, reflectors). If the target were a single IP advertised from a single site, the sum of all attacker bandwidth converges on that site and the site's link capacity is the ceiling.

Under anycast, each attacker's packets are routed to the attacker's nearest POP, not to the victim's. A botnet with nodes spread across 161 countries (as in Cloudflare's 7.3 Tbps writeup) is automatically load-balanced across the fleet's per-POP capacity. The attack's distribution works for the defender:

  • 7.3 Tbps / 4.8 Bpps flood was "detected and mitigated in 477 data centers across 293 locations".
  • No data centre sees the full 7.3 Tbps; each sees its geographic share.
  • Capacity-planning benchmark flips from "worst-case aggregate flood" (what a central scrubbing site must absorb) to "worst-case per-POP flood from nearest attacker cluster" (much smaller).
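The dilution is easy to quantify. A back-of-envelope calculation using the figures from the Cloudflare writeup (the even split is an idealisation; the real distribution follows attacker geography):

```python
# Back-of-envelope: how anycast dilutes a volumetric flood across POPs.
# Figures from Cloudflare's 7.3 Tbps writeup; an even split is an
# idealisation -- real shares track where the botnet nodes sit.

ATTACK_TBPS = 7.3
DATA_CENTERS = 477

per_pop_tbps = ATTACK_TBPS / DATA_CENTERS
per_pop_gbps = per_pop_tbps * 1000

print(f"Idealised per-POP share: {per_pop_gbps:.1f} Gbps")  # ~15.3 Gbps
```

Even allowing for heavy geographic skew, the per-POP burden is orders of magnitude below the aggregate, which is exactly the capacity-planning flip described above.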

Pairs with patterns/autonomous-distributed-mitigation: because every POP runs the same detection / mitigation stack, anycast's distribution works end-to-end — the mitigation happens at the POP that received the attack traffic, not at a central service.

Why it works for CDN / latency

Orthogonal but related: the same mechanism that spreads attack traffic also spreads legitimate traffic. The average user's RTT to an anycast edge is dominated by the geographic distance to the nearest POP, which is low when the fleet is dense. Cache-hit behaviour is then a per-POP concern, not a per-fleet concern.
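The distance-dominated-RTT claim has a simple physical floor: light in fibre travels at roughly two-thirds of c, about 200 km per millisecond one way. A rough sketch (the constant is an approximation; queuing and routing detours only add to it):

```python
# Rough RTT floor from geographic distance. Light in fibre covers
# ~200 km/ms one way, so round-trip cost is roughly distance_km / 100
# in milliseconds. Real paths add queuing and peering detours on top.

def rtt_floor_ms(distance_km: float) -> float:
    return distance_km / 100.0  # 2 * distance / (200 km per ms)

print(rtt_floor_ms(50))    # nearby POP in a dense fleet: ~0.5 ms
print(rtt_floor_ms(5000))  # distant POP in a sparse fleet: ~50 ms
```

Densifying the fleet shrinks the typical distance-to-nearest-POP, which directly lowers this floor for the median user.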

Failure modes

  • BGP path flaps — route changes mid-session can move a client's packets from POP A to POP B, breaking any per-POP connection state (TCP sessions, TLS keys, L4 load-balancer state). Stateful protocols over anycast need either very stable BGP (common in practice) or explicit session migration. Cloudflare's edge load balancer (Unimog) sits in this gap.
  • Suboptimal BGP selection — "topologically nearest" is not "geographically nearest"; peering / transit policy can route a client thousands of km out of the way. Anycast-CDN performance is partly a peering-engineering problem, not just a POP-density problem.
  • Per-POP capacity ceiling — the aggregate fleet can absorb a lot, but a well-geolocated attacker with nodes clustered near one POP can still overwhelm that POP's link capacity. This is why DDoS vendors advertise total network capacity (not per-POP capacity) and invest in densifying hot regions.
  • State coordination — any globally-consistent view of per-flow state requires cross-POP coordination (gossip, central control plane, or accept some consistency lag). DDoS fingerprint gossip (patterns/gossip-fingerprint-propagation) is an example of engineering around this.
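The first failure mode can be made concrete with a toy model (hypothetical topology; `serving_pop` is a stand-in for BGP best-path selection, not a real API): anycast gives no per-flow affinity, so a route change can land the same TCP 5-tuple at a different POP that holds none of the connection state.

```python
# Toy illustration of anycast's lack of flow affinity. serving_pop()
# stands in for BGP best-path selection: deterministic for a given
# routing state ("epoch"), but free to change when routes change.

import hashlib

POPS = ["fra", "sin", "iad"]

def serving_pop(client_prefix: str, routing_epoch: int) -> str:
    h = hashlib.sha256(f"{client_prefix}/{routing_epoch}".encode()).digest()
    return POPS[h[0] % len(POPS)]

before = serving_pop("203.0.113.0/24", routing_epoch=1)
after = serving_pop("203.0.113.0/24", routing_epoch=2)  # a path flap

# If before != after, mid-session packets now arrive at a POP with no
# TCP/TLS state for this flow -- the gap that Unimog-style L4 load
# balancers and explicit session migration are built to close.
print(before, after)
```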

Seen in

  • sources/2025-06-20-cloudflare-how-cloudflare-blocked-a-monumental-7-3-tbps-ddos-attack — 7.3 Tbps attack spread across 477 data centres / 293 locations by anycast routing of the victim's Magic-Transit-advertised IP; the single largest disclosed instance of anycast-as-DDoS-defence on the wiki. Full-autonomous mitigation runs on each receiving POP's own servers (no central scrubbing tier).
  • sources/2025-07-16-cloudflare-1111-incident-on-july-14-2025 — the symmetric failure-mode instance: because the systems/cloudflare-1-1-1-1-resolver|1.1.1.1 Resolver's prefixes are advertised via anycast from every POP, a service-topology misconfig that triggered a global BGP withdrawal produced a 62-minute worldwide outage at Internet-routing speed. Anycast multiplies reach in both directions — advertisement wins latency + DDoS, withdrawal loses the service globally in one step.
  • sources/2026-04-17-cloudflare-agents-week-network-performance-update — latency-win instance. Anycast is the substrate that makes new PoP deployments (Wroclaw, Malang, Constantine) automatically serve their nearby user populations without client-side config changes: the same IPs are advertised from the new PoP, BGP routes the nearest users to it, connection time drops (Wroclaw free-tier 19 → 12 ms, −40 %). Anycast is implicit in the PoP densification playbook.
  • sources/2024-08-15-flyio-were-cutting-l40s-prices-in-half — inference-locality instance. Fly.io names Anycast as the network-locality axis of its inference compute-storage-network locality thesis: GPU compute + Tigris object storage + "an Anycast network that's fast everywhere in the world" combined on one platform. Anycast's role here is not DDoS defence or CDN static asset caching — it is transaction-shaped inference routing, landing the HTTP request at the nearest GPU-hosting POP where the model weights are already regionally cached.
  • sources/2025-02-26-flyio-taming-a-voracious-rust-proxy — fly-proxy-as-Anycast-router instance. Fly.io states the edge-router / Anycast pairing explicitly: "Edges exist almost solely to run a Rust program called fly-proxy, the router at the heart of our Anycast network." The 2025-02 incident is a reminder that Anycast + per-edge-host pegging interact non-obviously — because packets land on the topologically-nearest POP, the CPU-busy edge sees only a geographic slice of the incoming Tigris load test, which is why the symptom was localised to two IAD hosts rather than fleet-wide. Anycast's load-distribution works the same way for production bugs as for attacks.
  • sources/2025-05-28-flyio-parking-lot-ffffffffffffffff — Anycast-as-globally-correlated-blast-radius instance. Fly.io's 2024 global Anycast deadlock is the canonical wiki instance of Anycast's dark side: because routing state is in a global broadcast domain (every proxy receives every update), a routing-state-derived software bug (an if let read-lock-over-both-arms bug) can produce fleet-wide-correlated failure at the speed of intra-Corrosion gossip propagation ("millisecond intervals of time"). The motivating force behind Fly.io's regionalization effort is to shrink this broadcast domain so most routing updates stay within the region (Sydney, Frankfurt, Dallas) rather than hitting every proxy worldwide. "We call this effort 'regionalization', because the next intermediate goal is to confine most updates to the region in which they occur." Complements the Cloudflare BGP-withdrawal instances on the routing-protocol side — Fly's instance shows the same global-correlation risk inside a custom routing protocol (Corrosion).