PATTERN Cited by 1 source
Behavior-based bot classification
Classify automated traffic by what the bot does on your site (observable behavior) rather than by operator identity, user-agent string, or a binary "AI/not-AI" label. Assign each bot one or more behavioral categories, then let site owners set policy at the category level.
Problem
The AI landscape evolves faster than static bot lists. A binary "block AI bots" control conflates indexing for search (which drives referrals back) with model training (which absorbs content permanently). Multi-purpose crawlers force site owners into all-or-nothing decisions.
Solution
- Define a finite taxonomy of behaviors (Search, Agent, Training, Transact, etc.).
- Classify each known bot with one or more behaviors (multi-label, not single-label).
- Expose per-category toggles to site owners (allow/block per behavior).
- When a crawler carries multiple labels, enforce the most restrictive applicable rule.
- Incentivize operators to separate crawlers by purpose: single-purpose crawlers get cleaner access.
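The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular product: the `Behavior` members cover only the four categories named above, the bot names and the `BOT_BEHAVIORS` registry are hypothetical, and treating unknown bots with a default action is an assumption of this sketch.

```python
from enum import Enum


class Behavior(Enum):
    # Subset of the taxonomy named in the text; a real taxonomy has more categories.
    SEARCH = "search"
    AGENT = "agent"
    TRAINING = "training"
    TRANSACT = "transact"


class Action(Enum):
    ALLOW = 0
    BLOCK = 1  # a higher value means a more restrictive action


# Hypothetical registry: each known bot maps to a set of behaviors (multi-label).
BOT_BEHAVIORS: dict[str, frozenset[Behavior]] = {
    "example-search-bot": frozenset({Behavior.SEARCH}),
    "example-multi-bot": frozenset({Behavior.SEARCH, Behavior.TRAINING}),
}


def decide(bot: str, policy: dict[Behavior, Action],
           default: Action = Action.ALLOW) -> Action:
    """Return the most restrictive action across all of the bot's behaviors."""
    behaviors = BOT_BEHAVIORS.get(bot)
    if not behaviors:
        return default  # unknown bot: fall back to the default (assumption)
    actions = [policy.get(b, default) for b in behaviors]
    return max(actions, key=lambda a: a.value)


# Per-category toggles set by a site owner: allow Search, block Training.
site_policy = {Behavior.SEARCH: Action.ALLOW, Behavior.TRAINING: Action.BLOCK}

print(decide("example-search-bot", site_policy))  # Action.ALLOW
print(decide("example-multi-bot", site_policy))   # Action.BLOCK
```

Ordering the actions numerically keeps the "most restrictive wins" rule a one-line `max`, and it extends naturally if further actions (for example a hypothetical challenge or rate-limit step) are added between allow and block.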
Consequences
- Site owners get granular control without maintaining per-bot rules.
- Operators face an incentive to separate Search from Training crawlers to gain better access.
- Future-proof: new AI use-cases map to existing or new behavior categories without rearchitecting the control surface.
- Over-blocking risk: multi-purpose crawlers that combine Search with Training may lose Search access for customers who block Training.
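To make the over-blocking risk and the resulting operator incentive concrete, the following sketch compares one combined crawler with the same operator's traffic split into two single-purpose crawlers. The bot names, the string labels, and the `allowed` helper are all hypothetical, and the site policy is assumed to block only Training.

```python
def allowed(labels: set[str], blocked: set[str]) -> bool:
    """A crawler is allowed only if none of its behavior labels is blocked."""
    return labels.isdisjoint(blocked)


blocked = {"training"}  # site owner blocks Training and allows everything else

# One crawler that does both jobs versus two single-purpose crawlers.
combined = {"combined-bot": {"search", "training"}}
split = {"search-bot": {"search"}, "training-bot": {"training"}}

combined_access = {name: allowed(labels, blocked) for name, labels in combined.items()}
split_access = {name: allowed(labels, blocked) for name, labels in split.items()}

print(combined_access)  # {'combined-bot': False}
print(split_access)     # {'search-bot': True, 'training-bot': False}
```

Under this policy the combined crawler loses Search access entirely, while the split deployment keeps it; that difference is what pushes operators toward single-purpose crawlers.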
Known Uses
- systems/cloudflare-botbase: implements the 10-category taxonomy with multi-label classification (Source: sources/2026-07-01-cloudflare-ai-traffic-options)