Cloudflare: Introducing pay per crawl: Enabling content owners to charge AI crawlers for access
Summary
Cloudflare announces Pay Per Crawl (private beta, 2025-07-01), a framework
that lets publishers monetize AI-crawler access to their content at
internet scale by reviving the mostly-unused
HTTP 402 Payment Required response code. For each request, a site owner
chooses one of three outcomes: Allow (free access), Charge (a flat,
domain-wide per-request price quoted via 402), or Block (no access).
Charging a crawler that has no billing relationship with Cloudflare acts
as a functional 403 that still advertises that a relationship could
exist. Decisions are enforced by a rules engine that runs after the
site's WAF and bot-management layer.
Anti-spoofing uses Cloudflare's Web Bot Auth
proposal: crawlers generate an Ed25519 keypair, publish the JWK-formatted
public key at a hosted directory, register the directory URL + user
agent with Cloudflare, and sign every request with
HTTP Message Signatures (RFC 9421)
carrying signature-agent, signature-input, and signature headers.
There are two price-negotiation flows: reactive (the crawler requests a
resource, gets 402 + crawler-price, and retries with crawler-exact-price)
and preemptive (the crawler sends crawler-max-price up front and is served
with crawler-charged if the configured price ≤ max). Cloudflare acts as the
Merchant of Record, aggregates billing events across all
participating publishers and crawlers, charges crawlers, and distributes
earnings. Framed as groundwork for a future
agentic paywall where AI agents
programmatically negotiate content access inside a spending budget.
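The two negotiation flows summarized above can be sketched from the server's side. This is only an illustration, not Cloudflare's implementation: the helper names (`parse_price`, `format_price`, `respond`) are hypothetical, header names are assumed to be lower-cased already, and two behaviors are assumptions the post does not spell out here: a mismatched `crawler-exact-price` is answered with a fresh 402 quote, and sending both price headers raises an error.

```python
from decimal import Decimal
from typing import Dict, Tuple


def parse_price(value: str) -> Decimal:
    """Parse a header value such as 'USD 0.01' into a Decimal amount."""
    currency, amount = value.split()
    if currency != "USD":
        raise ValueError(f"unsupported currency: {currency}")
    return Decimal(amount)


def format_price(amount: Decimal) -> str:
    """Render an amount in the 'USD XX.XX' shape used by the price headers."""
    return f"USD {amount:.2f}"


def respond(configured: Decimal, request_headers: Dict[str, str]) -> Tuple[int, Dict[str, str]]:
    """Return (status, response headers) for a crawler on a Charge rule."""
    exact = request_headers.get("crawler-exact-price")
    maximum = request_headers.get("crawler-max-price")
    if exact is not None and maximum is not None:
        # Only one of the two headers is allowed per request.
        # Raising here is an assumption; the wire-level response is not specified.
        raise ValueError("send crawler-exact-price or crawler-max-price, not both")

    quote = (402, {"crawler-price": format_price(configured)})
    paid = (200, {"crawler-charged": format_price(configured)})

    if exact is not None:
        # Reactive retry: the crawler echoes the quoted price.
        # Assumption: any other amount is answered with a new quote.
        return paid if parse_price(exact) == configured else quote
    if maximum is not None:
        # Preemptive: charge the configured price, never the max.
        return paid if configured <= parse_price(maximum) else quote
    # No payment intent: quote the price.
    return quote
```

For example, `respond(Decimal("0.01"), {"crawler-max-price": "USD 0.05"})` yields a 200 with `crawler-charged: USD 0.01`, reflecting that the crawler pays the configured price rather than its stated ceiling.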
Key takeaways
- HTTP 402 as the negotiation primitive. Cloudflare picks the long-dormant `402 Payment Required` status code (defined but effectively unused since HTTP/1.1) as the signal that a resource requires payment, rather than inventing a new code. The response carries a `crawler-price: USD XX.XX` header. Standard HTTP semantics mean existing clients, proxies, and log collectors understand the response as a client error without the crawler having to intercept anything custom at the network layer. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
- Anti-spoofing via Web Bot Auth, not IP allowlists. Because the crawler pays, Cloudflare must be certain which crawler is making each request. Allowlists of crawler IPs aren't good enough: anyone on the same egress IP could spoof. Web Bot Auth requires crawlers to (a) generate an Ed25519 keypair, (b) publish the JWK-formatted public key in a hosted directory, (c) register the directory URL and user-agent with Cloudflare, and (d) sign every HTTP request with RFC 9421 HTTP Message Signatures using that key. Cryptographic identity, not network identity. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
- Three request headers carry the signature. Signed crawler requests include `signature-agent` (directory URL identifying the bot operator), `signature-input` (covered fields, keyid, algorithm `ed25519`, timestamp, expiry, nonce, tag `web-bot-auth`), and `signature` (the actual Ed25519 signature bytes, base64-encoded). Cloudflare resolves the `signature-agent` URL, fetches the JWK directory, picks the key by `keyid`, and verifies the signature. The `web-bot-auth` tag disambiguates this use of HTTP Message Signatures from other deployments. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
- Two price-negotiation flows: reactive and preemptive. Reactive: the crawler requests the resource blind, receives `HTTP 402 Payment Required` + `crawler-price: USD XX.XX`, then retries with `crawler-exact-price: USD XX.XX`, declaring willingness to pay that price. Preemptive: the crawler includes `crawler-max-price: USD XX.XX` on the first request; if `configured-price ≤ max-price`, the server responds `200 OK` with `crawler-charged: USD XX.XX` (the configured price, not the max). If the configured price exceeds the max, the server replies `402` with the configured `crawler-price` as usual. Only one of `crawler-exact-price` or `crawler-max-price` is allowed per request. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
- Rules engine runs after WAF and bot-management. Allow / Charge / Block decisions are applied only after the site's existing WAF policies and bot-management / bot-blocking features have fired. This lets publishers keep their security posture unchanged: they layer monetization on top, not through it. Architecturally, bot auth lives in a later pipeline stage than access / threat controls. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
- "Charge" without a billing relationship is a 403 with a future option. If a publisher selects Charge for a crawler that doesn't have a billing relationship with Cloudflare and therefore can't be charged, the effect is identical to `403 Forbidden` (no content returned), but the crawler is told pricing exists and a future relationship is possible. This is intentional: the crawler operator sees "you could access this for $X if you opted in" instead of a silent network-level block. This turns block-for-unknown-bots into a standing offer. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
- Cloudflare is Merchant of Record. Billing events emit when a signed crawler request with payment intent receives an HTTP-200-family response with a `crawler-charged` header. Cloudflare aggregates the events, charges the crawler, and distributes earnings to publishers. This eliminates the bilateral-contract coordination problem: historically, charging a crawler required "knowing the right individual and striking a one-off deal", which was insurmountable for small publishers. A single intermediary turns it into an N-to-M marketplace. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
- Priced flat-per-request domain-wide initially, with per-crawler overrides. Publishers configure one flat price across their entire site; they can bypass the charge for specific crawlers (e.g. a pre-negotiated partnership, a free crawler they want to allow). Explicitly not supported at launch: per-path, per-content-type, or dynamic (demand-based) pricing; Cloudflare flags these as things they expect to evolve. Granular licensing (training vs inference vs search) is future scope. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
- Agentic paywall is the stated end-state. Cloudflare frames HTTP 402 as the substrate for a future where AI agents hit 402s, consult a user-granted budget, decide whether to pay, and retry with `crawler-exact-price`, all programmatically. "Imagine asking your favorite deep research program to help you synthesize the latest cancer research ... and then giving that agent a budget to spend to acquire the best and most relevant content." The pay-per-crawl protocol is deliberately designed to generalize from crawler-to-publisher to agent-to-resource. (Source: sources/2025-07-01-cloudflare-pay-per-crawl)
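The agent-side half of the reactive flow (hit a 402, consult a budget, decide whether to retry) can be sketched as follows. This is a hypothetical illustration: the `Budget` class and its method names are invented for this note, header names are assumed to be lower-cased, and real requests would also need to carry the Web Bot Auth signature headers, which are omitted here.

```python
from decimal import Decimal
from typing import Dict, Optional


class Budget:
    """Tracks a user-granted spending limit in USD (hypothetical helper)."""

    def __init__(self, limit: Decimal) -> None:
        self.remaining = limit

    def retry_headers(self, status: int, response_headers: Dict[str, str]) -> Optional[Dict[str, str]]:
        """Return headers for a paid retry, or None if the agent should not pay."""
        quoted = response_headers.get("crawler-price")
        if status != 402 or quoted is None:
            return None  # nothing to negotiate
        currency, amount = quoted.split()
        if currency != "USD" or Decimal(amount) > self.remaining:
            return None  # unknown currency or over budget: skip this resource
        # Echo the quoted price back to declare willingness to pay it.
        return {"crawler-exact-price": quoted}

    def settle(self, status: int, response_headers: Dict[str, str]) -> None:
        """Deduct the amount reported in crawler-charged after a 2xx response."""
        charged = response_headers.get("crawler-charged")
        if 200 <= status < 300 and charged is not None:
            self.remaining -= Decimal(charged.split()[1])
```

The budget is only decremented from `crawler-charged` on a 2xx response, mirroring the condition under which the note says billing events are emitted.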
Systems / concepts / patterns introduced
- Systems: systems/pay-per-crawl, systems/web-bot-auth.
- Concepts: concepts/http-402-payment-required, concepts/http-message-signatures, concepts/agentic-paywall, concepts/verified-bots.
- Patterns: patterns/price-header-negotiation, patterns/signed-bot-request, patterns/merchant-of-record-aggregation.
Operational numbers
- Ed25519 signatures on every request (a compact, fast public-key signature scheme: 64-byte signatures, 32-byte public keys, designed for constant-time implementation).
- Signature window: `created`…`expires` timestamps in `signature-input` (example in post uses a 1-hour window, 1735689600 → 1735693200).
- Nonce: randomly generated per request, base64-url, anti-replay (example in post is 64 bytes).
- Launch shape: flat per-request USD price domain-wide.
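The window and nonce figures above can be reproduced with a small sketch. The function names and the dictionary layout are hypothetical, the nonce encoding follows this note's description (URL-safe base64 of 64 random bytes, padding kept), and treating both window boundaries as inclusive is an assumption. Serializing these values into a real `signature-input` header and signing them is out of scope here.

```python
import base64
import secrets
import time
from typing import Optional


def new_signature_params(now: Optional[int] = None,
                         window_seconds: int = 3600,
                         nonce_bytes: int = 64) -> dict:
    """Build the time window and nonce for one signed request."""
    created = int(time.time()) if now is None else now
    # Fresh random nonce per request, so a captured request cannot be replayed.
    nonce = base64.urlsafe_b64encode(secrets.token_bytes(nonce_bytes)).decode("ascii")
    return {
        "created": created,
        "expires": created + window_seconds,
        "nonce": nonce,
        "alg": "ed25519",
        "tag": "web-bot-auth",
    }


def is_within_window(params: dict, now: int) -> bool:
    """Check that `now` falls inside the created..expires window (inclusive, by assumption)."""
    return params["created"] <= now <= params["expires"]
```

With `now=1735689600` and the default one-hour window, `expires` comes out as 1735693200, matching the example values quoted from the post.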
Caveats / gaps
- Private beta at post time. No crawler counts, no publisher counts, no billing-event throughput, no latency overhead of the signature-verification path, no price-discovery data.
- Crawler must publish a JWK directory — imposes a static hosting requirement on every participating bot operator. No fallback for crawlers that can't run a web directory.
- Only one `crawler-exact-price` OR `crawler-max-price` per request: no way to send both, no way to express "I'll pay up to X for content A but up to Y for content B" in a single request.
- Flat-price-only at launch. No per-path, per-content-type, per-rate-tier, or granular-license (training vs inference) pricing.
- Payment settlement details undocumented. How crawler bills are settled (prepaid balance? invoice? card-on-file?), dispute resolution, refund mechanics, publisher minimum-payout thresholds all unspecified in the launch post.
- Bot-auth identity revocation path unspecified. If a crawler's Ed25519 key leaks, the JWK directory presumably rotates, but the grace period / propagation / in-flight-signed-request handling aren't described.
- No discussion of how non-Cloudflare-fronted sites participate. Cloudflare acts as the Merchant of Record, so participating sites appear to need a Cloudflare zone — origin-direct sites without Cloudflare in front cannot plug into the marketplace.
Source
- Original: https://blog.cloudflare.com/introducing-pay-per-crawl/
- Raw markdown: raw/cloudflare/2025-07-01-cloudflare-to-introduce-pay-per-crawl-for-ai-bots-8eb15448.md
- HN: https://news.ycombinator.com/item?id=44432385 (569 points)
Related
- companies/cloudflare
- sources/2025-03-25-cloudflare-opkssh-open-sourcing — another Cloudflare play applying public-key identity to a protocol that had none of its own (OPKSSH grafts OIDC public-key commitments onto SSH via certificate extensions; pay-per-crawl grafts cryptographic bot identity onto HTTP via RFC 9421 signatures).
- concepts/fine-grained-billing — per-request billing granularity, cousin of AWS Lambda's 1ms billing evolution.