CONCEPT Cited by 1 source

Undeclared crawler

Definition

An undeclared crawler is any automated web client that is not listed in its operator's published crawler directory: no matching user-agent, source IP, or Web Bot Auth signature ties the traffic to a documented program.

Undeclared is an attribution statement, not an intent statement:

  • Some undeclared crawlers are benign: new bots that have not yet been publicly announced, internal fetchers never intended for general web crawling, unattributed third-party aggregators.
  • Some undeclared crawlers are stealth crawlers that deliberately evade identification.

From an origin's perspective the two are indistinguishable without behavioral analysis, which is why ML-based fingerprinting is load-bearing: attribution failure is the default for anything outside the declared set.
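The attribution check described above can be sketched roughly as follows. This is a minimal illustration, not any vendor's actual logic: the `DeclaredCrawler` and `Request` types, the `attribute` function, the `ExampleBot` entry, and the matching policy (a verified signature key alone is sufficient, otherwise both the user-agent token and the source IP must match) are all assumptions made for the example. The IP range comes from the reserved documentation blocks, and signature verification itself is assumed to have happened upstream.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class DeclaredCrawler:
    """One entry in an operator's published crawler directory (hypothetical shape)."""
    name: str
    ua_token: str                      # substring expected in the User-Agent header
    networks: Tuple[str, ...]          # published source ranges, as CIDR strings
    key_ids: FrozenSet[str] = frozenset()  # signature key ids, if the operator publishes any


@dataclass(frozen=True)
class Request:
    user_agent: str
    source_ip: str
    # Key id of a signature that was ALREADY verified upstream; None if absent.
    verified_key_id: Optional[str] = None


def attribute(req: Request, directory: Iterable[DeclaredCrawler]) -> Optional[str]:
    """Return the declared crawler's name, or None if the request is undeclared.

    Assumed policy: a verified signature key is sufficient on its own;
    otherwise the user-agent token AND the source IP must both match,
    since a user-agent string alone is trivially spoofable.
    """
    addr = ip_address(req.source_ip)
    for crawler in directory:
        if req.verified_key_id is not None and req.verified_key_id in crawler.key_ids:
            return crawler.name
        ua_match = crawler.ua_token.lower() in req.user_agent.lower()
        ip_match = any(addr in ip_network(cidr) for cidr in crawler.networks)
        if ua_match and ip_match:
            return crawler.name
    # Attribution failure is the default outcome for anything outside the declared set.
    return None


# Hypothetical directory with a single declared bot.
DIRECTORY = [
    DeclaredCrawler(
        name="ExampleBot",
        ua_token="ExampleBot",
        networks=("192.0.2.0/24",),
        key_ids=frozenset({"example-key-1"}),
    )
]
```

A `None` result only says the request could not be attributed. It does not separate a benign unannounced bot from a stealth crawler; that distinction still needs behavioral analysis.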

Canonical instance

Perplexity's third crawler (Cloudflare, 2025-08-04) is both undeclared and stealth:

  • Not named in Perplexity's bot documentation.
  • IPs outside the published Perplexity range.
  • Generic Chrome-on-macOS user-agent.
  • 3-6 million requests per day, spread across domains.

Seen in
