PATTERN
# Comparative RUM benchmarking

## Problem
You want to publish a credible, network-fair comparison of your CDN / edge / network product against competitors — not a marketing graph, but a methodologically defensible one — across a large denominator of networks. Synthetic probes are track-conditions, not road-conditions; marketing labs cherry-pick. You need apples-to-apples measurements from real user networks, with every tested provider sampled from the same client in the same network at the same moment.
## Pattern
Use the user's browser as the probe, running a Real User Measurement JS test that exchanges small payloads with every provider in the comparison (not just your own), then aggregate with a robust statistic (trimean) to produce a per-network ranking.
Key shape choices:
- Probe runs on a non-critical page. Cloudflare runs it on Cloudflare-branded error pages — so the user isn't waiting on a successful request and the probe never delays the user's real workflow.
- All providers measured from one client moment. The browser fetches small files from Cloudflare, Amazon CloudFront, Google, Fastly, Akamai in parallel; the comparison is same browser, same WiFi, same ISP, same second. Removes the classic benchmark-bias sources.
- Latency metric is handshake-centric. Connection time — DNS + TCP/QUIC + TLS handshake — is chosen over throughput / TTFB / page load because it's what a CDN can most directly control and what most approximates perceived "Internet speed."
- Denominator is population-weighted. The APNIC top-1,000 networks by estimated population — not the top-1,000 by traffic, not an arbitrary country list. Ensures wins / losses are weighted by real human reach.
- Ranking metric is robust. Trimean of per-sample connection time per (provider, network, day). Then count the networks where provider X has the lowest trimean.
- Report a headline plus a gap. Two numbers: "X% of networks where we win" and "average ms gap to next-fastest". Cloudflare's Dec-2025 numbers: fastest in 60% of networks, with a 6 ms average gap.
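The robust statistic above can be made concrete. A minimal sketch of Tukey's trimean, which weights the median against the two quartile hinges (illustrative sample values, not real measurements):

```python
from statistics import quantiles

def trimean(samples: list[float]) -> float:
    """Tukey's trimean: (Q1 + 2*median + Q3) / 4.
    Robust to outliers in either tail, unlike the arithmetic mean."""
    q1, q2, q3 = quantiles(samples, n=4)  # quartiles (default exclusive method)
    return (q1 + 2 * q2 + q3) / 4

# One provider's connection-time samples (ms) in one network on one day;
# the 900 ms outlier barely moves the trimean, but would drag the mean to ~166 ms.
samples = [42.0, 45.0, 44.0, 43.0, 41.0, 900.0, 46.0]
print(round(trimean(samples), 1))  # → 44.0
```

The trimean keeps the ranking stable against the occasional pathological sample (a dropped packet, a radio wake-up) without throwing tail data away entirely at collection time.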
## Anatomy / pipeline
```
[ Cloudflare error page load ]
        │
        ▼  (silent JS speedtest)
[ Browser fetches small files from
  CF + CloudFront + Google + Fastly + Akamai ]
        │  (records per-exchange duration = connection time)
        ▼
[ Beacon telemetry to Cloudflare ]
        │
        ▼
[ Aggregation: trimean per (provider, network, day) ]
        │
        ▼
[ Per-network ranking: count networks where each
  provider has the lowest trimean ]
        │
        ▼
[ Publish headline % + average gap in ms ]
```
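The aggregation and ranking stages of this pipeline can be sketched end-to-end. The schema and helper names here are illustrative assumptions, not Cloudflare's actual implementation; the per-day dimension is collapsed for brevity:

```python
from collections import defaultdict
from statistics import quantiles

def trimean(xs: list[float]) -> float:
    q1, q2, q3 = quantiles(xs, n=4)
    return (q1 + 2 * q2 + q3) / 4

def rank_networks(samples: list[tuple[str, str, float]]):
    """samples: (provider, network ASN, connection time ms) tuples.
    Returns (% of networks each provider wins, that provider's average
    ms gap to the next-fastest provider in the networks it wins)."""
    by_key = defaultdict(list)
    for provider, asn, ms in samples:
        by_key[(provider, asn)].append(ms)
    tm = {key: trimean(vals) for key, vals in by_key.items()}

    networks = {asn for _, asn in tm}
    wins, gaps = defaultdict(int), defaultdict(list)
    for asn in networks:
        # Providers in this network, fastest trimean first.
        ranked = sorted((ms, p) for (p, a), ms in tm.items() if a == asn)
        winner_ms, winner = ranked[0]
        wins[winner] += 1
        if len(ranked) > 1:
            gaps[winner].append(ranked[1][0] - winner_ms)  # gap to next-fastest

    pct_won = {p: 100 * wins[p] / len(networks) for p in wins}
    avg_gap = {p: sum(g) / len(g) for p, g in gaps.items()}
    return pct_won, avg_gap
```

Feeding it per-sample beacons for two providers across two networks yields exactly the two published numbers per provider: share of networks won and average winning margin in milliseconds.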
## Why it works
- Removes geographic cherry-picking. Every provider is measured on every client — you can't pick a favourable benchmark location because the benchmark location is wherever each real user happens to be.
- Real last-mile + peering conditions. The measurement includes whatever is between the user and each provider (ISP, peering, transit, BGP choices) — which is exactly the thing CDN buyers care about.
- Volume at zero marginal cost. Compared to a synthetic program with probes in every ASN, RUM gets coverage for free.
- Reproducible by third parties. Anyone with RUM instrumentation and enough traffic can compute the same rankings; the APNIC population denominator is public.
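The public, population-based denominator is also trivial to reconstruct. A sketch of the selection step, assuming a mapping from ASN to estimated user population in the style of APNIC's public per-AS estimates (the figures below are illustrative, not real APNIC data):

```python
def select_denominator(population: dict[str, int], n: int) -> set[str]:
    """Top-n networks by estimated user population.
    Each selected network then counts exactly once in the ranking,
    so wins and losses are weighted by where real users actually are."""
    return set(sorted(population, key=population.get, reverse=True)[:n])

# Illustrative ASN -> estimated-users figures, not real APNIC numbers.
population = {"AS100": 5_000_000, "AS200": 3_000_000, "AS300": 400_000,
              "AS400": 300_000, "AS500": 250_000}
print(sorted(select_denominator(population, 3)))  # → ['AS100', 'AS200', 'AS300']
```

Because the denominator is fixed before any measurement, a provider cannot improve its headline by steering traffic toward networks it already wins.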
## Failure modes / caveats
- Error-page cohort bias. Cloudflare's probe runs on error pages, so the sample population is users whose requests failed. Not identically distributed with the median successful-request user; trade-off is non-intrusiveness on the success path vs. a potentially skewed cohort. See concepts/benchmark-methodology-bias.
- Connection-state bias. If the browser has an open HTTP/2 or HTTP/3 session with one of the providers already, its "connection time" is effectively zero for this sample — artificially advantaging it. Defensible implementations force a cold connection per sample.
- Trimean hides tails. Robust ranking of median user experience but silent on p95 / p99 tail-latency problems; see concepts/tail-latency-at-scale for why those matter. Don't claim "fastest" without also publishing tail numbers.
- Methodology is under your control. The publishing provider chose the metric, the denominator, the aggregation, and the sampling site. A competitor can run the same pattern and reach a different headline by changing any of those knobs. "Cloudflare is fastest on RUM-trimean-connection-time on the APNIC-1000" is a specific, defensible claim — but not the only possible ranking.
- Competitor exclusion. Only the providers your RUM script fetches from are in the comparison. Regional / specialised CDNs that might win in some networks are off the chart.
- Ethical / privacy surface. RUM sends timing beacons tied to user sessions; collection and storage must obey consent and retention policies.
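The "trimean hides tails" caveat is easy to demonstrate numerically: two providers can be indistinguishable on the trimean while one gives 1% of users a multi-second handshake. A self-contained illustration with synthetic samples:

```python
from statistics import quantiles

def trimean(xs: list[float]) -> float:
    q1, q2, q3 = quantiles(xs, n=4)
    return (q1 + 2 * q2 + q3) / 4

def p99(xs: list[float]) -> float:
    return quantiles(xs, n=100)[98]  # 99th percentile

# Two providers with identical central tendency...
steady = [40.0] * 99 + [45.0]    # well-behaved tail
spiky  = [40.0] * 99 + [2000.0]  # 1% of users see a 2 s handshake

print(trimean(steady), trimean(spiky))  # → 40.0 40.0  (identical rank)
print(p99(steady), p99(spiky))          # wildly different tails
```

A trimean-based ranking would call these two providers tied, which is why the pattern's headline claim should be scoped to typical-user experience, not "fastest" unqualified.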
## When to reach for it
- You're a CDN / edge / DNS provider publishing periodic comparative performance posts.
- You want a defensible methodology (you can publish the statistical recipe; it's reproducible) rather than a marketing graph.
- You have enough user traffic across enough networks that the RUM cohort is large enough for trimean per (provider, network, day) to be tight.
## When not to reach for it
- You lack global traffic — the RUM cohort won't cover enough networks. Use a hybrid (synthetic + partial RUM).
- Your metric is tails-sensitive (real-time / streaming applications). Trimean-of-connection-time is wrong there; use a percentile-based approach.
- You need per-request causality (why was user X's request slow?). RUM is an aggregate metric; tracing one specific request is the wrong tool.
## Seen in
- sources/2026-04-17-cloudflare-agents-week-network-performance-update — canonical wiki instance. Cloudflare publishes its network-performance ranking (40 % → 60 % fastest in top 1,000 networks, Sept → Dec 2025; average 6 ms gap to next-fastest in Dec 2025) using this exact shape: RUM probe on Cloudflare error pages → fetch from CF + CloudFront + Google + Fastly + Akamai → trimean per (provider, network) → count networks won → publish headline plus average gap.