
CONCEPT Cited by 1 source

Content-use signaling

Content-use signaling is a protocol-level mechanism that lets site owners express how their content may be stored and reshared by automated visitors, extending beyond the binary allow/block model of traditional robots.txt. It builds on the Content Signals specification.

Definition

A content-use signal is a machine-readable preference — expressed in robots.txt or HTTP headers — that communicates the degree of reproduction a site owner permits. It exists on a spectrum from "interact but store nothing" to "fully reproduce."

Three Levels

| Value | Meaning | Example |
| --- | --- | --- |
| immediate | Interact, but store and reuse nothing | Chat fetch bots that read and discard |
| reference | Index, excerpt, and link back | Search engines that show snippets with attribution |
| full | Summarize and reproduce | Training crawlers that absorb full content |
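
Because the three values run from least to most reproduction, a consumer could model them as an ordered scale. The following is a minimal Python sketch; the ContentUse name, the numeric ranks, and the within_limit helper are illustrative assumptions, not part of any specification.

```python
from enum import IntEnum


class ContentUse(IntEnum):
    """Hypothetical ordering of the three content-use levels (ranks are assumed)."""
    IMMEDIATE = 0  # interact, but store and reuse nothing
    REFERENCE = 1  # index, excerpt, and link back
    FULL = 2       # summarize and reproduce


def within_limit(declared: ContentUse, limit: ContentUse) -> bool:
    """Return True if a bot's declared level does not exceed the site's limit."""
    return declared <= limit
```

Treating the levels as ordered is what makes a phrase like "only up to reference" well defined: immediate and reference pass, full does not.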

Protocol Expression

Content-use extends the Content Signals specification via robots.txt:

User-agent: *
Content-Signal: search=yes,ai-train=no,use=reference
Allow: /

The use field is a fourth signal alongside the three that Content Signals already defines (search, ai-input, and ai-train). Allow and Disallow remain ordinary robots.txt directives rather than content signals.
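
To show how a crawler might read the directive, here is a small Python sketch that splits one Content-Signal line into key/value pairs. The parse_content_signal function is hypothetical, and its leniency (case-insensitive names, tolerated whitespace, silently skipped malformed pairs) is an assumption rather than behavior the specification mandates.

```python
def parse_content_signal(line: str) -> dict[str, str]:
    """Parse one 'Content-Signal: k=v,k=v' robots.txt line into a dict."""
    name, _, value = line.partition(":")
    if name.strip().lower() != "content-signal":
        raise ValueError(f"not a Content-Signal line: {line!r}")
    signals = {}
    for pair in value.split(","):
        key, sep, val = pair.partition("=")
        if sep:  # skip entries that have no '=' (assumed lenient handling)
            signals[key.strip().lower()] = val.strip().lower()
    return signals


signals = parse_content_signal("Content-Signal: search=yes,ai-train=no,use=reference")
print(signals)  # {'search': 'yes', 'ai-train': 'no', 'use': 'reference'}
```

A real crawler would also need to associate the line with its enclosing User-agent group, which this sketch leaves out.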

Enforcement Model

Content-use signals are preferences, not direct enforcement — similar to Disallow in robots.txt, which is advisory. However, Cloudflare adds teeth:

  • Each bot tracked in systems/cloudflare-botbase has a recorded content-use level.
  • Bots that abuse declared signals lose Verified status.
  • Bots with content-use full cannot achieve Verified status at all.
  • Enterprise Bot Management customers can write rules combining category + content-use level (e.g., "allow Search + SEO + Ads Verification, but only up to reference").
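
The category-plus-ceiling rule in the last bullet can be sketched as a simple predicate. This is an illustration in Python, not Cloudflare's rule syntax; the allow_bot function, the LEVEL_RANK ordering, and the category strings are assumptions made for the example.

```python
LEVEL_RANK = {"immediate": 0, "reference": 1, "full": 2}  # assumed ordering


def allow_bot(category: str, content_use: str,
              allowed_categories: set[str], max_use: str) -> bool:
    """Allow a bot only if its category is listed and its level is within the ceiling."""
    if category not in allowed_categories:
        return False
    return LEVEL_RANK[content_use] <= LEVEL_RANK[max_use]


# "Allow Search + SEO + Ads Verification, but only up to reference"
allowed = {"Search", "SEO", "Ads Verification"}
print(allow_bot("Search", "reference", allowed, "reference"))  # True
print(allow_bot("Search", "full", allowed, "reference"))       # False
print(allow_bot("Agent", "immediate", allowed, "reference"))   # False
```

Both conditions must hold: a listed category with too high a level is rejected, and so is an unlisted category at any level.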

Design Rationale

  • Traditional robots.txt is binary (crawl or don't). It doesn't express "you can index but don't reproduce in full."
  • The three levels map to actual use-cases: real-time agents need immediate, search engines need reference, training pipelines want full.
  • Composability with bot taxonomy: rules like "Agent + immediate = allowed, Agent + full = blocked" enable nuanced policy.
