CONCEPT Cited by 1 source
Proxy-node correlation signal¶
Definition¶
A proxy-node correlation signal fires when the same session identifier — typically a browser-telemetry fingerprint — appears from multiple proxy-node IP addresses within a window inconsistent with organic human behaviour. Individually, neither the fingerprint (which may look legitimate) nor the IP (which may be a legitimate proxy user) is decisive. The correlation across both is the signal.
Canonical wiki instance¶
From Vercel's 2026-04-21 BotID Deep Analysis post:
"These browser sessions that had initially appeared legitimate started showing up across a range of IP addresses. Crucially, these IPs were identified as proxy nodes, rather than network origin points. This was the tell-tale sign: legitimate users don't rapidly cycle through proxy networks while maintaining the same browser profile."
And:
"The key insight wasn't any single red flag. It was the correlation: identical browser fingerprints cycling through proxy infrastructure."
The signal is explicitly the intersection of two features:
- A stable browser-telemetry fingerprint across sessions.
- The session source IPs being identified as proxy nodes (not origin ISPs).
A real user with one browser and one proxy, a common setup, doesn't trip the signal. A bot operator running N browser-automation instances behind M proxy nodes does, because the same automation fingerprint ends up appearing on many proxy IPs: the operator cannot keep every fingerprint unique while sourcing from a sufficiently diverse IP pool.
Why it defeats per-session stealth¶
Sophisticated bot operators invest heavily in making each individual session look legitimate:
- Browser telemetry is crafted to mimic real devices.
- Behavioural patterns simulate human interaction timing.
- Proxy rotation hides the true origin IP.
- User-Agent and other headers look standard.
Each signal viewed alone is inconclusive. But the operator has to choose:
- Keep the fingerprint stable across the fleet (so automation can reuse a vetted template) and cycle IPs through proxies — exposes the proxy-correlation signal.
- Or rotate fingerprints per-session (so each browser looks unique from the IP's perspective), which explodes infrastructure cost and may introduce per-instance fingerprint defects that telemetry classifiers catch.
The two pressures are in tension — the defender wins by detecting whichever one the operator concedes.
Signal construction¶
Implementing the signal requires:
- A stable cross-session key. Typically a hash of the browser-telemetry fingerprint, normalised to ignore session-random nonces.
- Proxy-node classification of source IPs. Some IPs belong to residential ISPs and serve one user household at a time; others belong to proxy providers (commercial residential-proxy networks, mobile-carrier CGNAT, datacenter proxies). A bot-management vendor maintains the proxy-vs-origin classifier as a side-input.
- A time/cardinality threshold. A single fingerprint appearing on, say, 5+ proxy-node IPs within 10 minutes is anomalous; the exact threshold is vendor-tuned.
- An action. In Vercel / Kasada's case the action is forced re-verification (patterns/correlation-triggered-reverification), not an immediate block, because the signal alone is not strong enough to rule out false positives (FPs) on edge-case legitimate users.
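The four ingredients above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Vercel's or Kasada's implementation, which the post does not disclose. The names `fingerprint_key`, `ProxyCorrelationDetector`, `VOLATILE_FIELDS` and the telemetry field names are hypothetical; the 5-IP / 10-minute defaults simply reuse the illustrative threshold from the list, and `is_proxy_node` stands in for the vendor-maintained classifier. Events are assumed to arrive in non-decreasing timestamp order.

```python
import hashlib
import json
from collections import defaultdict, deque

# Hypothetical names for session-random telemetry fields (an assumption,
# not taken from the source post).
VOLATILE_FIELDS = {"nonce", "session_id", "timestamp"}


def fingerprint_key(telemetry: dict) -> str:
    """Stable cross-session key: hash the telemetry minus session-random fields."""
    stable = {k: v for k, v in telemetry.items() if k not in VOLATILE_FIELDS}
    # Canonical JSON so that key order does not change the hash.
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ProxyCorrelationDetector:
    """Flags a fingerprint seen on many distinct proxy-node IPs within a window."""

    def __init__(self, is_proxy_node, window_seconds=600.0, min_distinct_ips=5):
        self.is_proxy_node = is_proxy_node        # side-input classifier: ip -> bool
        self.window_seconds = window_seconds      # illustrative: 10 minutes
        self.min_distinct_ips = min_distinct_ips  # illustrative: 5 proxy IPs
        self._seen = defaultdict(deque)           # key -> deque of (timestamp, ip)

    def observe(self, telemetry: dict, ip: str, ts: float) -> str:
        """Record one request and return 'allow' or 'reverify'."""
        # Only proxy-node sightings count; origin IPs never feed the signal.
        if not self.is_proxy_node(ip):
            return "allow"
        events = self._seen[fingerprint_key(telemetry)]
        events.append((ts, ip))
        # Drop sightings that have fallen out of the sliding window.
        while events and ts - events[0][0] > self.window_seconds:
            events.popleft()
        distinct_ips = {seen_ip for _, seen_ip in events}
        if len(distinct_ips) >= self.min_distinct_ips:
            return "reverify"  # forced re-verification, not a block
        return "allow"
```

With these defaults, one fingerprint arriving from five different proxy-node IPs inside ten minutes returns `"reverify"` on the fifth sighting, while the same fingerprint repeatedly arriving from a single proxy IP keeps returning `"allow"`, matching the one-browser, one-proxy case described earlier.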
Proxy-node vs origin-IP classification¶
The post asserts "these IPs were identified as proxy nodes, rather than network origin points" — implying Kasada maintains a proxy-vs-origin classifier independent of the bot-vs-human classifier. This is plausible architecture — IP reputation services are a standard component in the bot-management product space — but the post doesn't disclose the classifier's inputs or cadence.
Features likely used (not disclosed, inferred):
- ASN characteristics — residential-proxy ASNs are known.
- Observed traffic profile — proxy nodes carry mixed customer traffic with distinctive port/protocol patterns.
- Commercial proxy-provider IP list matches.
- CGNAT / mobile-carrier range annotations.
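As a rough illustration of how the list-based features above might combine, the following Python sketch labels an IP as `"proxy"`, `"cgnat"`, or `"origin"`. Everything here is an assumption made for illustration: the `ProxyClassifier` name, the three labels, the precedence order, and the example ranges and ASN (drawn from address and ASN blocks reserved for documentation) are not from the post, and the observed-traffic-profile feature is omitted because it needs flow data rather than list lookups.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class ProxyClassifier:
    """Hypothetical list-based proxy-vs-origin classifier (illustrative only)."""

    proxy_asns: set = field(default_factory=set)          # known residential-proxy ASNs
    provider_ranges: list = field(default_factory=list)   # commercial proxy-provider networks
    cgnat_ranges: list = field(default_factory=list)      # annotated carrier CGNAT egress ranges

    def classify(self, ip: str, asn: int) -> str:
        addr = ipaddress.ip_address(ip)
        # CGNAT annotation is checked first so that downstream logic can apply
        # a more lenient threshold to shared mobile egress (a design assumption).
        if any(addr in net for net in self.cgnat_ranges):
            return "cgnat"
        if any(addr in net for net in self.provider_ranges):
            return "proxy"
        if asn in self.proxy_asns:
            return "proxy"
        return "origin"


# Example wiring with placeholder data from documentation-reserved blocks.
classifier = ProxyClassifier(
    proxy_asns={64496},
    provider_ranges=[ipaddress.ip_network("203.0.113.0/24")],
    cgnat_ranges=[ipaddress.ip_network("198.51.100.0/24")],
)
```

Keeping a separate `"cgnat"` label, rather than folding it into `"proxy"`, is one way to leave room for different thresholds on carrier ranges, where many legitimate users share an IP.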
Complementary / competing signals¶
- concepts/ml-bot-fingerprinting — TLS / HTTP-level fingerprinting. Operates on a different feature space; fires without needing cross-session correlation.
- concepts/composite-fingerprint-signal — the general framework of combining individually-weak signals.
- concepts/browser-telemetry-fingerprint — the stable session key that makes cross-IP correlation meaningful.
- concepts/coordinated-bot-network — the target phenomenon.
Limitations¶
- Requires correlation window. A fleet that operates in parallel for <60 seconds before dispersing may not trip the threshold.
- Operator-adaptable. A sophisticated operator can fingerprint-rotate per-session to avoid the cross-session key; the trade-off is tool-complexity and per-instance defect risk.
- Ambiguous for mobile. Mobile-carrier CGNAT legitimately co-locates many real users behind the same IP, and their fingerprints can collide. The thresholds need to be tuned to avoid FPs on mobile.
- Privacy surface. Maintaining a proxy-classification list of IP addresses is itself a user-surveillance primitive.
Seen in¶
- sources/2026-04-21-vercel-botid-deep-analysis-catches-a-sophisticated-bot-network-in-real-time — canonical wiki instance. The proxy-node correlation is the named "tell-tale sign" that triggered Deep Analysis' reclassification of a 40-45-profile coordinated bot fleet.
Related¶
- concepts/browser-telemetry-fingerprint — the stable session key.
- concepts/coordinated-bot-network — the phenomenon this signal detects.
- concepts/adaptive-bot-reclassification — the reclassification-after-correlation behaviour.
- concepts/composite-fingerprint-signal / concepts/ml-bot-fingerprinting / concepts/bot-vs-human-frame.
- patterns/correlation-triggered-reverification — the response pattern that this signal triggers.
- systems/vercel-botid-deep-analysis — the canonical system.