
BGP route withdrawal

BGP route withdrawal is the act of unadvertising a prefix from a BGP speaker — once withdrawn, transit and peer routers remove the prefix from their tables and stop forwarding traffic to it. On the public Internet the withdrawal propagates globally in the low-seconds-to-minutes range. For an anycast service that depends on advertising the same prefix from many POPs, a global withdrawal makes the service unreachable worldwide at Internet-routing speed.
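The core property — "reachable iff at least one POP still advertises" — can be sketched with a toy model. This is illustrative only; `AnycastPrefix` and its methods are invented for this page, not a real BGP API:

```python
class AnycastPrefix:
    """Toy model: tracks which POPs currently advertise an anycast prefix."""

    def __init__(self, prefix, pops):
        self.prefix = prefix
        self.advertising = set(pops)  # POPs currently announcing the prefix

    def withdraw(self, pop):
        # A withdrawal removes one POP's announcement; peers converge
        # and drop the corresponding path from their tables.
        self.advertising.discard(pop)

    def reachable(self):
        # Globally reachable iff at least one POP still announces it.
        return bool(self.advertising)


svc = AnycastPrefix("1.1.1.0/24", ["LHR", "SJC", "NRT"])
svc.withdraw("LHR")
assert svc.reachable()            # partial withdrawal: service degrades regionally at most
for pop in list(svc.advertising):
    svc.withdraw(pop)             # global withdrawal: single action, total outage
assert not svc.reachable()
```

The asymmetry the next section describes falls out of this model: losing one POP is invisible to most clients, but the step from "one advertisement left" to "zero" is a cliff, not a slope.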

Why it's the single dominant failure primitive for anycast services

Anycast's win is that the same IP is advertised from every POP, so every packet gets delivered to the nearest one. The symmetric risk is that any path that ends in "stop advertising this prefix everywhere" is a single-action global outage — no progressive degradation, no partial regions, no graceful failover unless engineered in explicitly. There is no inherent blast-radius containment between "the advertisement exists" and "the advertisement has been withdrawn"; the gradient is configured on top.

Two distinct causal classes produce identical wire-level behaviour:

  • External hijack — another AS starts advertising a more-specific or equally-specific route; transit providers prefer it, and the true origin's prefix is no longer on the best path. Cloudflare's 2024-06-27 1.1.1.1 incident was this class.
  • Self-inflicted withdrawal — the service owner's own control plane stops advertising the prefix. The 2025-07-14 Cloudflare 1.1.1.1 incident (see sources/2025-07-16-cloudflare-1111-incident-on-july-14-2025) was this class: a service-topology config change caused the control plane to shrink the Resolver's topology from "all locations" to "one offline location", so all POPs withdrew.

From the Internet's perspective these look similar during the event: the prefix disappears from the global routing table. Cloudflare Radar even flagged Tata Communications' 21:54 UTC advertisement of 1.1.1.0/24 during the 2025-07-14 outage as a hijack — which it was, but it was not what caused the outage; it only became visible because Cloudflare's legitimate advertisement had been pulled.
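Distinguishing the classes after the fact comes down to asking which origin ASes are still visible for the prefix in global routing data. A minimal classifier sketch — the route records and return labels are invented for illustration, though AS13335 is genuinely Cloudflare's ASN:

```python
def classify(prefix, routes, legit_origin):
    """Classify why a prefix looks unhealthy from a global vantage point.

    routes: list of (prefix, origin_asn) announcements currently observed.
    legit_origin: the ASN that should be originating the prefix.
    """
    origins = {asn for p, asn in routes if p == prefix}
    if legit_origin in origins:
        # Legitimate origin still visible; extra origins mean a contested hijack.
        return "healthy" if origins == {legit_origin} else "hijack-contested"
    if origins:
        # Visible only from foreign ASes. This looks like a hijack, but — as on
        # 2025-07-14 — the root cause may be a self-withdrawal that merely
        # exposed a pre-existing rogue announcement.
        return "only-foreign-origin"
    return "withdrawn-everywhere"


# The 2025-07-14 view after Tata's 21:54 UTC announcement (AS4755):
classify("1.1.1.0/24", [("1.1.1.0/24", 4755)], legit_origin=13335)
```

Note that "only-foreign-origin" is deliberately ambiguous: the routing table alone cannot tell you whether the foreign announcement displaced a healthy one or filled a vacuum, which is exactly the confusion the Radar hijack flag illustrates.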

Restoring a withdrawn prefix

Re-advertising the BGP prefix is near-instant in the routing-protocol sense: peer routers see the announcement, propagate it, and traffic returns. But the service-level recovery can be much slower:

  • Server-side IP bindings may have been removed as a side effect of the control-plane change that withdrew the prefix. In the 2025-07-14 incident, ~23% of Cloudflare's edge fleet had to have IP bindings re-added via the change-management system — "which is not an instantaneous process by default for safety" — normally taking multiple hours of progressive rollout.
  • DNS caches and connection state at clients need to recover on their own timelines; a re-announced prefix doesn't rebind in-flight client state.

So patterns/fast-rollback applied to BGP is two-phase: the re-announcement is fast; the fleet-side plumbing is slow-by-design and has to be accelerated case-by-case during incidents.
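The two-phase shape can be made concrete with a small sketch. The helper structure and batch logic are hypothetical — not Cloudflare's actual tooling — but the ordering constraint is the point:

```python
def restore_service(prefix, pops, missing_bindings, batch_size=2):
    """Return the ordered restore steps: announce everywhere, then re-bind.

    Phase 1 is near-instant in BGP terms; phase 2 is progressive and slow
    by design, accelerated during incidents by widening batch_size.
    """
    steps = []
    # Phase 1: re-announce from every POP; traffic starts returning as
    # soon as peers converge on the new announcement.
    for pop in pops:
        steps.append(("announce", pop, prefix))
    # Phase 2: re-add server IP bindings in progressive batches, so a bad
    # restore is caught at a batch boundary rather than fleet-wide.
    for i in range(0, len(missing_bindings), batch_size):
        for server in missing_bindings[i:i + batch_size]:
            steps.append(("bind", server, prefix))
    return steps


plan = restore_service("1.1.1.0/24", ["LHR", "SJC"], ["edge-1", "edge-2", "edge-3"])
# Every "announce" step precedes every "bind" step: routing recovers first,
# then the fleet catches up — mirroring the 62-minute vs 34-minute split.
```

Raising `batch_size` models the "accelerated case-by-case" part: the safe default is small batches, and incident response trades that safety margin for speed.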

Generalises to any anycast control plane

The lesson extends beyond Cloudflare:

  • Any CDN / DNS / DDoS / cloud-edge provider announcing service IPs via anycast from a global fleet has this same single-action global-outage surface.
  • Configuration surfaces that decide where a prefix is advertised from are as safety-critical as the code that serves traffic — and arguably more so, since a bad topology change can be effectively instantaneous in impact where a bad code push still has to propagate through a deployment pipeline.
  • The standard mitigation is patterns/progressive-configuration-rollout — gate advertisement changes behind canary / staged deployment with automated rollback on health-signal regression, so a bad topology change fails closed at the canary, not at the fleet.
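A minimal sketch of that mitigation — canary-first rollout with automated rollback on a health regression. The callback-based shape is an assumption for illustration, not any vendor's actual deployment system:

```python
def rollout(change, pops, apply, healthy, revert):
    """Apply `change` POP-by-POP, canary first (pops[0]).

    apply/healthy/revert are hypothetical callbacks: push the topology
    change to one POP, check its health signal, and undo the change.
    Returns True if the change reached the whole fleet.
    """
    applied = []
    for pop in pops:
        apply(change, pop)
        applied.append(pop)
        if not healthy(pop):
            # Fail closed: unwind everything applied so far. A bad
            # topology change stops at the canary, not at the fleet.
            for p in reversed(applied):
                revert(change, p)
            return False
    return True
```

The key property is that the blast radius of a bad change is bounded by how far the rollout got before the health signal regressed — restoring exactly the containment gradient that raw anycast advertisement lacks.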

Seen in

  • sources/2025-07-16-cloudflare-1111-incident-on-july-14-2025 — canonical wiki instance of self-inflicted anycast withdrawal: global BGP withdrawal of all 11 Resolver prefixes produced 62 minutes of customer impact; revert to re-announce was near-instant in BGP terms but the full service restoration required 34 additional minutes of accelerated server-binding restore.