CLOUDFLARE 2025-07-01 Tier 1

Cloudflare: Introducing pay per crawl: Enabling content owners to charge AI crawlers for access

Summary

Cloudflare announces Pay Per Crawl (private beta, 2025-07-01), a framework that lets publishers monetize AI-crawler access to their content at internet scale by reviving the mostly unused HTTP 402 Payment Required response code. For each crawler's requests a site owner chooses one of three outcomes: Allow (free access), Charge (a flat, domain-wide, per-request price signaled via 402), or Block (deny access). Choosing Charge for a crawler that has no billing relationship with Cloudflare behaves like a 403 while still advertising that a paid relationship could exist. These decisions are enforced by a rules engine that runs after the site's WAF and bot-management layer. Anti-spoofing uses Cloudflare's Web Bot Auth proposal: crawlers generate an Ed25519 keypair, publish the JWK-formatted public key at a hosted directory, register the directory URL and user agent with Cloudflare, and sign every request with HTTP Message Signatures (RFC 9421) carrying signature-agent, signature-input, and signature headers. There are two price-negotiation flows. In the reactive flow the crawler requests the resource, receives 402 with crawler-price, and retries with crawler-exact-price. In the preemptive flow the crawler sends crawler-max-price up front and is served with crawler-charged if the configured price is at most the stated maximum. Cloudflare acts as the Merchant of Record: it aggregates billing events across all participating publishers and crawlers, charges crawlers, and distributes earnings. The post frames this as groundwork for a future agentic paywall in which AI agents programmatically negotiate content access within a spending budget.

Key takeaways

HTTP 402 as the negotiation primitive. Rather than inventing a new status code, Cloudflare uses the long-dormant 402 Payment Required (reserved for future use in HTTP/1.1 and effectively unused since) as the signal that a resource requires payment. The response carries a crawler-price: USD XX.XX header. Because 402 is a standard 4xx code, existing clients, proxies, and log collectors already treat the response as a client error, and crawlers need no custom network-layer handling to detect it. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
Anti-spoofing via Web Bot Auth, not IP allowlists. Because the crawler pays, Cloudflare must be certain which crawler is making each request. Allowlists of crawler IPs aren't good enough — anyone on the same egress IP could spoof. Web Bot Auth requires crawlers to (a) generate an Ed25519 keypair, (b) publish the JWK-formatted public key in a hosted directory, (c) register the directory URL and user-agent with Cloudflare, and (d) sign every HTTP request with RFC 9421 HTTP Message Signatures using that key. Cryptographic identity, not network identity. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
Three request headers carry the signature. Signed crawler requests include signature-agent (the directory URL identifying the bot operator), signature-input (covered fields, keyid, algorithm ed25519, created and expires timestamps, nonce, and tag web-bot-auth), and signature (the Ed25519 signature bytes, base64-encoded as a structured-field byte sequence per RFC 9421). Cloudflare resolves the signature-agent URL, fetches the JWK directory, selects the key by keyid, and verifies the signature. The web-bot-auth tag distinguishes this use of HTTP Message Signatures from other deployments. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
Two price-negotiation flows — reactive and preemptive. Reactive: crawler requests the resource blind → receives HTTP 402 Payment Required + crawler-price: USD XX.XX → retries with crawler-exact-price: USD XX.XX declaring willingness to pay that price. Preemptive: crawler includes crawler-max-price: USD XX.XX on the first request; if configured-price ≤ max-price, the server responds 200 OK with crawler-charged: USD XX.XX (the configured price, not the max). If configured price exceeds the max, server replies 402 with the configured crawler-price as usual. Only one of crawler-exact-price or crawler-max-price is allowed per request. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
Rules engine runs after WAF and bot-management. Allow / Charge / Block decisions are applied only after the site's existing WAF policies and bot-management / bot-blocking features have run. Publishers keep their security posture unchanged and layer monetization on top of it, not through it. Architecturally, the pay-per-crawl decision lives in a later pipeline stage than access and threat controls. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
"Charge" without a billing relationship is a 403 with a future option. If a publisher selects Charge for a crawler that doesn't have a billing relationship with Cloudflare and therefore can't be charged, the effect is identical to 403 Forbidden (no content returned) — but the crawler is told pricing exists and a future relationship is possible. This is intentional: the crawler operator sees "you could access this for $X if you opted in" instead of a silent network-level block. Turns block-for-unknown-bots into a standing offer. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
Cloudflare is Merchant of Record. A billing event is emitted when a signed crawler request with payment intent receives a 2xx response carrying a crawler-charged header. Cloudflare aggregates the events, charges the crawler, and distributes earnings to publishers. This removes the bilateral-contract coordination problem: historically, charging a crawler required "knowing the right individual and striking a one-off deal", which was insurmountable for small publishers. A single intermediary turns it into an N-to-M marketplace. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
Priced flat-per-request domain-wide initially, with per-crawler overrides. Publishers configure one flat price across their entire site; they can bypass the charge for specific crawlers (e.g. a pre-negotiated partnership, a free crawler they want to allow). Explicitly not supported at launch: per-path, per-content-type, or dynamic (demand-based) pricing — Cloudflare flags these as things they expect to evolve. Granular-licensing (training vs inference vs search) is future scope. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
Agentic paywall is the stated end-state. Cloudflare frames HTTP-402 as the substrate for a future where AI agents hit 402s, consult a user-granted budget, decide whether to pay, and retry with crawler-exact-price — all programmatically. "Imagine asking your favorite deep research program to help you synthesize the latest cancer research ... and then giving that agent a budget to spend to acquire the best and most relevant content." The pay-per-crawl protocol is deliberately designed to generalize from crawler-to-publisher to agent-to-resource. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
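The three outcomes and the two negotiation flows described in the takeaways above can be sketched as one server-side decision function. This is a minimal illustration under stated assumptions, not Cloudflare's implementation: the function name decide, the policy strings, the can_be_billed flag, the lowercase header keys, and the 400 reply when both price headers are sent are all hypothetical. The post only states that one of the two headers is allowed per request.

```python
from decimal import Decimal


def parse_price(value: str) -> Decimal:
    """Parse a header value such as 'USD 0.01' into a Decimal amount."""
    currency, amount = value.split()
    if currency != "USD":
        raise ValueError(f"unsupported currency: {currency}")
    return Decimal(amount)


def fmt(amount: Decimal) -> str:
    """Render an amount in the 'USD XX.XX' header style used in the post."""
    return f"USD {amount:.2f}"


def decide(policy, configured_price, headers, can_be_billed=True):
    """Return (status, response_headers) for one already-verified crawler request.

    policy is 'allow', 'charge', or 'block' (hypothetical names).
    headers is a dict of request headers with lowercase keys (assumption).
    """
    if policy == "allow":
        return 200, {}
    if policy == "block":
        return 403, {}

    # policy == "charge"
    price_header = {"crawler-price": fmt(configured_price)}
    exact = headers.get("crawler-exact-price")
    maximum = headers.get("crawler-max-price")

    # Assumption: sending both headers is rejected as a malformed request.
    if exact is not None and maximum is not None:
        return 400, {}

    # Assumption: a crawler with no billing relationship only ever sees the price.
    if not can_be_billed:
        return 402, price_header

    # Reactive flow: the retry declares the exact configured price.
    if exact is not None and parse_price(exact) == configured_price:
        return 200, {"crawler-charged": fmt(configured_price)}

    # Preemptive flow: charge the configured price, never the stated maximum.
    if maximum is not None and configured_price <= parse_price(maximum):
        return 200, {"crawler-charged": fmt(configured_price)}

    # No payment intent, or the offer is too low: advertise the price.
    return 402, price_header
```

For example, decide("charge", Decimal("0.01"), {"crawler-max-price": "USD 0.05"}) returns a 200 with crawler-charged set to USD 0.01, which reflects the rule that the configured price is charged, not the maximum the crawler offered.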

Systems / concepts / patterns introduced

Systems: systems/pay-per-crawl, systems/web-bot-auth.
Concepts: concepts/http-402-payment-required, concepts/http-message-signatures, concepts/agentic-paywall, concepts/verified-bots.
Patterns: patterns/price-header-negotiation, patterns/signed-bot-request, patterns/merchant-of-record-aggregation.

Operational numbers

Ed25519 signatures on every request (64-byte signatures and 32-byte public keys, fast to sign and verify, designed for constant-time implementation).
Signature window: created … expires timestamps in signature-input (example in post uses a 1-hour window, 1735689600 → 1735693200).
Nonce: randomly generated per request, base64-url, anti-replay (example in post is 64 bytes).
Launch shape: flat per-request USD price domain-wide.
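The signature-input parameters listed above (created, expires, nonce, keyid, algorithm, tag) can be assembled with the Python standard library alone. The following is a hedged sketch: the helper names, the sig1 label, the test-key keyid, and the covered-component list ("@authority" and "signature-agent") are illustrative assumptions that this summary does not specify. Actual Ed25519 signing is left out because it requires a third-party library.

```python
import base64
import secrets
import time


def build_signature_input(keyid, created=None, window_seconds=3600, label="sig1"):
    """Build a Signature-Input header value with a fresh 64-byte nonce.

    window_seconds defaults to the 1-hour window used in the post's example.
    """
    created = int(time.time()) if created is None else created
    expires = created + window_seconds
    # 64 random bytes, base64-url encoded, as described in the notes above.
    nonce = base64.urlsafe_b64encode(secrets.token_bytes(64)).decode("ascii")
    # Assumption: which components are covered is illustrative only.
    covered = '("@authority" "signature-agent")'
    return (
        f"{label}={covered};created={created};expires={expires}"
        f';keyid="{keyid}";alg="ed25519";nonce="{nonce}";tag="web-bot-auth"'
    )


def is_within_window(created, expires, now=None):
    """Return True if now falls inside the half-open window [created, expires)."""
    now = int(time.time()) if now is None else now
    return created <= now < expires


if __name__ == "__main__":
    # Reproduces the timestamps from the post's example window.
    print(build_signature_input("test-key", created=1735689600))
```

A verifier would additionally need to remember recently seen nonces for the length of the window to reject replays. How Cloudflare does this is not described in the post.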

Caveats / gaps

Private beta at post time. No crawler counts, no publisher counts, no billing-event throughput, no latency overhead of the signature-verification path, no price-discovery data.
Crawler must publish a JWK directory — imposes a static hosting requirement on every participating bot operator. No fallback for crawlers that can't run a web directory.
Only one crawler-exact-price OR crawler-max-price per request — no way to send both, no way to express "I'll pay up to X for content A but up to Y for content B" in a single request.
Flat-price-only at launch. No per-path, per-content-type, per-rate-tier, or granular-license (training vs inference) pricing.
Payment settlement details undocumented. How crawler bills are settled (prepaid balance? invoice? card-on-file?), dispute resolution, refund mechanics, publisher minimum-payout thresholds all unspecified in the launch post.
Bot-auth identity revocation path unspecified. If a crawler's Ed25519 key leaks, the JWK directory presumably rotates, but the grace period / propagation / in-flight-signed-request handling aren't described.
No discussion of how non-Cloudflare-fronted sites participate. Cloudflare acts as the Merchant of Record, so participating sites appear to need a Cloudflare zone — origin-direct sites without Cloudflare in front cannot plug into the marketplace.

Source

Original: https://blog.cloudflare.com/introducing-pay-per-crawl/
Raw markdown: raw/cloudflare/2025-07-01-cloudflare-to-introduce-pay-per-crawl-for-ai-bots-8eb15448.md
HN: https://news.ycombinator.com/item?id=44432385 (569 points)

companies/cloudflare
sources/2025-03-25-cloudflare-opkssh-open-sourcing — another Cloudflare play applying public-key identity to a protocol that had none of its own (OPKSSH grafts OIDC public-key commitments onto SSH via certificate extensions; pay-per-crawl grafts cryptographic bot identity onto HTTP via RFC 9421 signatures).
concepts/fine-grained-billing — per-request billing granularity, cousin of AWS Lambda's 1ms billing evolution.