
CONCEPT Cited by 3 sources

Verified Bots

Verified bots is the general problem of distinguishing legitimate automated clients (search crawlers, AI training crawlers, uptime monitors, archive spiders, RSS fetchers) from abusive or imposter clients that claim to be them. The answer determines whether an origin serves content, blocks, rate-limits, or, under Cloudflare's pay-per-crawl (2025-07-01), bills the crawler.

Why "verified" is hard

  • User-agent strings are trivially forgeable. Googlebot/2.1 in a User-Agent header means nothing cryptographically.
  • IP allowlists drift. Cloud-egress IPs change without notice; NAT pools are shared; CDNs front multiple bots. IP-based identification has not been reliable for many years.
  • Reverse-DNS patterns (the classic Googlebot scheme — reverse-DNS the client IP, forward-DNS the result, compare) still rely on DNS trust chains and miss bots running on non-dedicated infrastructure.
  • API keys (bot shows X-API-Key: <secret>) are bearer tokens — leak once, impersonate forever; rotation is painful.
  • At pay-per-crawl scale, charging the wrong bot is a financial error, not just a policy error.
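
The reverse-DNS scheme from the list above (reverse-DNS the client IP, forward-DNS the result, compare) can be sketched roughly as follows. This is a minimal illustration, not any operator's official check: the suffix allowlist and the function names are assumptions made for the example, and the resolvers are injectable so the logic can be exercised without live DNS.

```python
import ipaddress
import socket
from typing import Callable, Iterable, Set

# Hypothetical allowlist of hostname suffixes; a real deployment would take
# these from the bot operator's own documentation.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """PTR lookup: IP -> hostname. Raises OSError if the lookup fails."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> Set[str]:
    """A/AAAA lookup: hostname -> set of IP address strings."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def is_forward_confirmed(
    client_ip: str,
    trusted_suffixes: Iterable[str] = TRUSTED_SUFFIXES,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], Set[str]] = forward_lookup,
) -> bool:
    """Return True only if PTR(ip) has a trusted suffix AND resolves back to ip."""
    try:
        hostname = reverse(client_ip).rstrip(".").lower()
        # Step 1: the PTR name must sit under a domain the operator controls.
        if not any(hostname.endswith(suffix) for suffix in trusted_suffixes):
            return False
        # Step 2: the forward lookup must include the original client IP.
        # Anyone who controls their own reverse zone can fake step 1 alone.
        wanted = ipaddress.ip_address(client_ip)
        return any(ipaddress.ip_address(a) == wanted for a in forward(hostname))
    except (OSError, ValueError):
        # Lookup failure or unparsable address: treat as unverified.
        return False
```

Even when this check passes, it inherits the weaknesses noted above: it trusts the DNS answers it receives, and it cannot cover a bot whose traffic leaves from shared or non-dedicated infrastructure.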

The cryptographic answer

Modern verified-bots schemes replace "which IP, DNS name, or User-Agent is this?" with "prove you hold a private key whose public counterpart is published in a directory I can fetch." A concrete instance is Cloudflare's Web Bot Auth:

  • Bot operator generates an Ed25519 keypair.
  • Publishes the public key at a JWK directory.
  • Registers the directory URL + user-agent with Cloudflare.
  • Signs every request with RFC 9421 HTTP Message Signatures.
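
As a rough illustration of the signing step, the sketch below builds the RFC 9421 signature base, which is the exact string a bot would sign with its Ed25519 private key. The covered component (`@authority`), the `tag` value, the key id, and the helper names are assumptions chosen for the example, not details confirmed by this page. The Ed25519 signing call itself is left out because the Python standard library does not provide it; any Ed25519 implementation would sign `base.encode()`.

```python
import base64
from typing import Dict, List


def signature_params(components: List[str], keyid: str,
                     created: int, expires: int, tag: str) -> str:
    """Serialize the covered components and parameters (RFC 9421 inner list)."""
    inner = " ".join(f'"{name}"' for name in components)
    return (f'({inner});created={created};expires={expires}'
            f';keyid="{keyid}";tag="{tag}"')


def signature_base(values: Dict[str, str], components: List[str],
                   params: str) -> str:
    """One '"name": value' line per component, then the @signature-params line."""
    lines = [f'"{name}": {values[name]}' for name in components]
    lines.append(f'"@signature-params": {params}')
    return "\n".join(lines)  # no trailing newline


def signature_headers(label: str, params: str, signature: bytes) -> Dict[str, str]:
    """Wrap an already-computed signature into the two request headers."""
    encoded = base64.b64encode(signature).decode("ascii")
    return {
        "Signature-Input": f"{label}={params}",
        "Signature": f"{label}=:{encoded}:",
    }


if __name__ == "__main__":
    components = ["@authority"]          # assumed set of covered components
    params = signature_params(
        components,
        keyid="example-key-id",          # hypothetical key identifier
        created=1700000000,
        expires=1700000060,
        tag="web-bot-auth",              # assumed tag value
    )
    base = signature_base({"@authority": "example.com"}, components, params)
    print(base)
```

The verifier rebuilds the same base from the incoming request and the `Signature-Input` header, then checks the signature against the public key from the directory. Because `created` and `expires` are part of what is signed, a captured signature stops being useful once the window closes.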

No shared secret ever travels. Every request is independently verifiable against a public directory. Keys can rotate by publishing new JWKs at the same directory. Bot identity is effectively a decentralized PKI, with the directory URL as the trust anchor.
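
Rotation at a fixed directory URL can be illustrated with a small lookup sketch. It assumes the directory is a JSON object with a `keys` array of JWKs and that a request's key id is the RFC 7638 thumbprint of the signing key; both are assumptions for the example rather than details stated here, and the function names are hypothetical.

```python
import base64
import hashlib
import json
from typing import Dict, List, Optional


def jwk_thumbprint(jwk: Dict[str, str]) -> str:
    """RFC 7638 SHA-256 thumbprint of an OKP (e.g. Ed25519) public JWK."""
    # Only the required members are hashed, in lexicographic order,
    # serialized without whitespace.
    required = {name: jwk[name] for name in ("crv", "kty", "x")}
    canonical = json.dumps(required, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def find_key(directory: Dict[str, List[Dict[str, str]]],
             keyid: str) -> Optional[Dict[str, str]]:
    """Return the Ed25519 JWK whose thumbprint matches keyid, or None."""
    for jwk in directory.get("keys", []):
        if jwk.get("kty") == "OKP" and jwk.get("crv") == "Ed25519":
            if jwk_thumbprint(jwk) == keyid:
                return jwk
    return None


if __name__ == "__main__":
    # Placeholder public keys: arbitrary 32-byte values, not real keys.
    def placeholder_key(fill: int) -> Dict[str, str]:
        x = base64.urlsafe_b64encode(bytes([fill] * 32)).rstrip(b"=").decode("ascii")
        return {"kty": "OKP", "crv": "Ed25519", "x": x}

    old_key, new_key = placeholder_key(1), placeholder_key(2)
    during_rotation = {"keys": [old_key, new_key]}  # both keys published
    after_rotation = {"keys": [new_key]}            # old key withdrawn

    old_id = jwk_thumbprint(old_key)
    print(find_key(during_rotation, old_id) is not None)
    print(find_key(after_rotation, old_id) is None)
```

Publishing the old and new keys side by side for an overlap period lets requests signed with either key verify, and removing the old entry later retires it without any change on the verifier's side.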

Generalizes beyond Cloudflare / Web Bot Auth

The pattern — per-request cryptographic proof of identity over a published public key — generalizes. The same primitive underlies:

  • OPKSSH (Cloudflare, 2025-03-25) — OIDC ID Token + public-key commitment = PK Token, attached to an SSH certificate extension to prove a user's identity-plus-keypair.
  • SSH-CA-backed access (BLESS, Smallstep) — short-lived certificate signed by a CA proves identity + holds key.
  • mTLS client certificates — X.509-based equivalent.
  • Web Bot Auth — RFC 9421-based equivalent, optimized for stateless HTTP request verification.

All four rest on the same load-bearing invariant: patterns/identity-to-key-binding.

