CONCEPT Cited by 1 source
Undeclared crawler¶
Definition¶
An undeclared crawler is any automated web client that cannot be matched to an entry in its operator's published crawler directory: no user-agent, source IP, or Web Bot Auth signature ties the traffic to a documented program.
Undeclared is an attribution statement, not an intent statement:
- Some undeclared crawlers are benign: new bots that have not yet been publicly announced, internal fetchers never intended for general web crawling, or unattributed third-party aggregators.
- Some undeclared crawlers are stealth crawlers that deliberately evade identification.
From an origin's perspective the two are indistinguishable without behavioral analysis, which is why ML-based fingerprinting is load-bearing: attribution failure is the default for anything outside the declared set.
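The attribution test in the definition can be sketched as a lookup against a crawler directory. This is a minimal illustration, not any vendor's implementation: `DeclaredCrawler`, `Request`, `attribution_signals`, and `is_undeclared` are hypothetical names, the IP ranges are reserved documentation addresses, and Web Bot Auth verification is assumed to happen upstream and arrive as the `verified_signer` field.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DeclaredCrawler:
    """One entry in a (hypothetical) published crawler directory."""
    name: str
    ua_token: str
    ip_ranges: Tuple[str, ...]


@dataclass(frozen=True)
class Request:
    user_agent: str
    source_ip: str
    # Assumption: set by an upstream component only when a Web Bot Auth
    # signature was cryptographically verified for that crawler name.
    verified_signer: Optional[str] = None


def attribution_signals(req: Request, directory: List[DeclaredCrawler]) -> List[str]:
    """Return every signal that ties the request to a documented crawler."""
    addr = ip_address(req.source_ip)
    signals = []
    for crawler in directory:
        if crawler.ua_token.lower() in req.user_agent.lower():
            signals.append(f"user-agent:{crawler.name}")
        if any(addr in ip_network(cidr) for cidr in crawler.ip_ranges):
            signals.append(f"ip:{crawler.name}")
        if req.verified_signer == crawler.name:
            signals.append(f"signature:{crawler.name}")
    return signals


def is_undeclared(req: Request, directory: List[DeclaredCrawler]) -> bool:
    """Undeclared means no signal matched; it says nothing about intent."""
    return not attribution_signals(req, directory)
```

An empty signal list only establishes that attribution failed. Separating a benign undeclared client from a stealth one still requires the behavioral analysis described above, and a stricter policy might refuse to trust a user-agent match on its own, since that header is trivially spoofed.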
Canonical instance¶
Perplexity's third crawler (Cloudflare, 2025-08-04) is both undeclared and stealth:
- Not named in Perplexity's bot documentation.
- IPs outside the published Perplexity range.
- Generic Chrome-on-macOS user-agent.
- 3-6 million requests per day in aggregate across domains.
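A cross-domain daily volume like the one listed above can only be measured by a party that sees traffic to many origins at once. The sketch below shows one way such a figure might be aggregated; it is not Cloudflare's method. `LogRecord`, its `fingerprint` field (an opaque client identifier assumed to come from some upstream fingerprinting step), and `daily_cross_domain_volume` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class LogRecord(NamedTuple):
    day: str          # ISO date, e.g. "2025-08-04"
    domain: str       # origin that received the request
    fingerprint: str  # hypothetical opaque client fingerprint


def daily_cross_domain_volume(
    records: Iterable[LogRecord],
) -> Dict[Tuple[str, str], Tuple[int, int]]:
    """Map (fingerprint, day) to (request count, distinct domains hit)."""
    requests = defaultdict(int)
    domains = defaultdict(set)
    for rec in records:
        key = (rec.fingerprint, rec.day)
        requests[key] += 1
        domains[key].add(rec.domain)
    return {key: (requests[key], len(domains[key])) for key in requests}
```

A single origin sees only its own slice of these counts, which is part of why an undeclared client with a generic browser user-agent is hard to single out from one site's logs alone.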
Seen in¶
- sources/2025-08-04-cloudflare-perplexity-stealth-undeclared-crawlers — canonical wiki instance.