PATTERN Cited by 1 source

Device blocklist from telemetry

A device blocklist from telemetry gates the rollout of a new backend, feature, or API to client devices: it gathers compatibility signals from live sessions, aggregates them per device class, and blocklists classes with high failure rates so that their future sessions skip the new path. The blocklist becomes the gating mechanism for continued rollout, replacing a binary "ship vs. don't ship" decision.

When to reach for it

Rolling out a new client API or backend that has known but hard-to-enumerate compatibility issues (e.g. GPU driver bugs for a new graphics API, browser-engine quirks for a new JS feature, OS-specific behaviors for a new system call).
The device space is too large to pre-test exhaustively: GPUs × OS versions × driver versions × browser versions multiply combinatorially.
You have dynamic fallback or another recovery mechanism so failures don't crash the session — they become data.

Shape

session starts on new backend
         │
         ▼
    run compatibility probe (or just use normally)
         │
         ├── success → continue
         │
         └── failure / fallback triggered
                  │
                  ▼
             emit telemetry event
                  │
                  ▼
         aggregator: per-device-class failure rates
                  │
                  ▼
         threshold breached → add class to blocklist
                  │
                  ▼
next session on that device-class → start on old backend directly
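
The per-session half of the flow above can be sketched in TypeScript. This is a minimal illustration, not Figma's implementation: `runSession`, `chooseStartBackend`, `TelemetryEvent`, and the `probe` / `emit` callbacks are all hypothetical names. It also assumes that successes are reported alongside failures so the aggregator has a denominator for its rates, which the diagram does not state.

```typescript
// Hypothetical sketch of the per-session decision; all names are illustrative.
type Backend = "new" | "old";

interface TelemetryEvent {
  deviceClass: string;
  outcome: "success" | "fallback";
}

// Blocklisted classes start on the old backend directly.
function chooseStartBackend(
  deviceClass: string,
  blocklist: ReadonlySet<string>,
): Backend {
  return blocklist.has(deviceClass) ? "old" : "new";
}

// Run one session and turn the result into telemetry instead of a crash.
function runSession(
  deviceClass: string,
  blocklist: ReadonlySet<string>,
  probe: () => boolean, // returns false (or throws) when the new backend misbehaves
  emit: (event: TelemetryEvent) => void,
): Backend {
  if (chooseStartBackend(deviceClass, blocklist) === "old") {
    return "old"; // never attempts the new backend, so no mid-session hitch
  }
  let ok: boolean;
  try {
    ok = probe();
  } catch {
    ok = false; // a thrown error is treated as a fallback trigger
  }
  // Assumption: successes are reported too, so rates can be computed.
  emit({ deviceClass, outcome: ok ? "success" : "fallback" });
  return ok ? "new" : "old";
}
```

A session on a blocklisted class returns `"old"` without calling `probe` at all; any other session ends on `"new"` or `"old"` depending on the probe result and emits exactly one event.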

Key design choices

Device-class granularity — aggregate by GPU model × driver version × OS × browser. Too coarse → over-blocking; too fine → not enough data per class to trigger.
Threshold calibration — fallback-rate threshold above which a class is blocklisted. Depends on user tolerance for the mid-session hitch. Figma's case: "falling back from WebGPU to WebGL can cause a hitch, which we'd like to avoid."
Non-load-blocking probes — if the compatibility check itself is expensive (like async-readback on WebGPU), run it post-session-start so it doesn't regress startup latency. See concepts/synchronous-vs-asynchronous-readback.
Blocklist distribution — push the blocklist to clients on startup (small config); clients decide locally whether to attempt the new backend.
Rollback path — if driver / browser / OS updates fix a class's issues, un-blocklist it. This requires ongoing monitoring and a mechanism to revisit the blocklist.
Initial blocklist — pre-populate with classes known to fail from pre-production testing; the telemetry loop then expands the list as rollout scales.
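
The granularity and threshold choices above can be sketched as a small aggregator. This is a hypothetical sketch: `DeviceInfo`, `deviceClassKey`, `FallbackAggregator`, and the `minSessions` guard are assumptions made for illustration, and the threshold values in the usage note are invented, not taken from the source. The minimum-sample guard is one way to handle the "too fine → not enough data" problem: a class is only eligible for blocklisting once it has enough sessions for its rate to mean something.

```typescript
// Hypothetical sketch; field names and key format are illustrative.
interface DeviceInfo {
  gpu: string;
  driver: string;
  os: string;
  browser: string;
}

// Build the device-class key: GPU model × driver version × OS × browser.
function deviceClassKey(d: DeviceInfo): string {
  return [d.gpu, d.driver, d.os, d.browser]
    .map((part) => part.trim().toLowerCase())
    .join("|");
}

interface Counts {
  sessions: number;
  fallbacks: number;
}

class FallbackAggregator {
  private readonly counts = new Map<string, Counts>();

  // Record one session outcome for a device class.
  record(deviceClass: string, fellBack: boolean): void {
    const c = this.counts.get(deviceClass) ?? { sessions: 0, fallbacks: 0 };
    c.sessions += 1;
    if (fellBack) c.fallbacks += 1;
    this.counts.set(deviceClass, c);
  }

  // Classes whose fallback rate exceeds `threshold`, ignoring classes
  // with fewer than `minSessions` observations.
  classesToBlock(threshold: number, minSessions: number): string[] {
    const blocked: string[] = [];
    this.counts.forEach((c, key) => {
      if (c.sessions >= minSessions && c.fallbacks / c.sessions > threshold) {
        blocked.push(key);
      }
    });
    return blocked.sort();
  }
}
```

With ten sessions recorded for a class and three of them falling back, `classesToBlock(0.2, 10)` includes that class and `classesToBlock(0.5, 10)` does not. A class seen only once is never returned under `minSessions = 10`, even if its single session failed.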

Why it beats "feature flag everything"

A global feature flag gives one boolean per user — on or off. It can't express "WebGPU works for this user except on their external display with a different GPU" or "works until the user switches Wi-Fi and the driver gets reinitialized."

The per-device-class blocklist expresses the compatibility matrix more faithfully. Combined with patterns/runtime-backend-swap-on-failure, even blocklist misses become data rather than crashes.
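
A small contrast makes the difference concrete. The sketch below is hypothetical: the user ID, the device-class strings, and both helper functions are invented for illustration. It shows one user who works on two machines: a per-user flag gives the same answer for both, while a per-class lookup can separate them.

```typescript
// Hypothetical data and helpers, for illustration only.

// A per-user feature flag: one boolean, regardless of device.
function flagAllows(flags: Record<string, boolean>, userId: string): boolean {
  return flags[userId] === true;
}

// A per-device-class blocklist: the answer depends on the hardware in use.
function blocklistAllows(
  blocked: ReadonlySet<string>,
  deviceClass: string,
): boolean {
  return !blocked.has(deviceClass);
}

const flags: Record<string, boolean> = { "user-42": true };
const blocked = new Set<string>(["gpu-b|drv-7|os-y|browser-z"]);

const laptop = "gpu-a|drv-3|os-x|browser-z"; // class with no known issues
const desktop = "gpu-b|drv-7|os-y|browser-z"; // class with a high fallback rate

// The flag cannot tell the two machines apart.
const flagOnLaptop = flagAllows(flags, "user-42");
const flagOnDesktop = flagAllows(flags, "user-42");

// The blocklist can.
const classOnLaptop = blocklistAllows(blocked, laptop);
const classOnDesktop = blocklistAllows(blocked, desktop);
```

Here both flag checks return the same value, while the blocklist admits the laptop's class and rejects the desktop's.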

Canonical instance

Figma's WebGPU rollout. Figma uses two overlapping mechanisms:

Initial blocklist from pre-production compatibility tests (pixel-readback probes against known-good outputs).
Ongoing blocklist expansion from fallback-rate telemetry: once the dynamic-fallback system was in place, rollout resumed, gated on the average fallback rate per device class.

"Using this approach, we were finally able to complete the rollout."

(Source: sources/2026-04-21-figma-rendering-powered-by-webgpu)
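
A pixel-readback probe of the kind mentioned above reduces to comparing rendered bytes against a known-good reference. The sketch below covers only that comparison step and is not drawn from the source: `pixelsMatch` and its per-channel `tolerance` parameter are hypothetical, and how the pixels are read back from the GPU is deliberately left out.

```typescript
// Hypothetical comparison step of a pixel-readback probe.
// `actual` is assumed to hold bytes read back from a test render and
// `expected` the known-good output for the same scene.
function pixelsMatch(
  actual: Uint8Array,
  expected: Uint8Array,
  tolerance: number = 2, // assumed per-channel slack for rounding differences
): boolean {
  if (actual.length !== expected.length) {
    return false; // different dimensions or formats count as a failure
  }
  for (let i = 0; i < actual.length; i++) {
    if (Math.abs(actual[i] - expected[i]) > tolerance) {
      return false;
    }
  }
  return true;
}
```

A device class whose probe output fails this check would be a candidate for the initial blocklist. The tolerance is a judgment call: zero demands bit-exact output, while a small value absorbs benign rounding differences between drivers.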

Sibling patterns

patterns/phased-cdn-rollout-passthrough-managed-auto — the Cloudflare Shared Dictionaries rollout expands phase by phase, gated on observed outcomes. Same shape on a different substrate.
patterns/cohort-percentage-rollout — percentage-based rollout without the per-device classification. Complementary, not a replacement.

Seen in

sources/2026-04-21-figma-rendering-powered-by-webgpu — Figma WebGPU rollout. Initial blocklist + telemetry-expanded blocklist + dynamic fallback together closed the rollout loop.

concepts/synchronous-vs-asynchronous-readback — why the compatibility probe moved off the startup critical path.
patterns/runtime-backend-swap-on-failure — the fallback mechanism whose failure rate feeds the blocklist.
patterns/phased-cdn-rollout-passthrough-managed-auto
patterns/cohort-percentage-rollout