CONCEPT Cited by 1 source

Dynamic backend fallback

Dynamic backend fallback is a resilience pattern in which a system starts a session on a preferred backend and, if that backend fails mid-session, swaps to a peer backend without tearing down the session or crashing the application for the user.

The pattern applies to any substrate where the application can target multiple interchangeable backends behind a shared abstraction layer:

  • Graphics APIs — WebGPU → WebGL on device loss or driver-bug-triggered failure.
  • LLM providers — preferred model → fallback model on provider outage; see patterns/automatic-provider-failover.
  • Storage backends — primary tier → secondary tier on disk/pool failure.
  • Network paths — preferred route → alternate route on path outage.

Shape

session starts on preferred backend
     ↓
mid-session failure (device-lost, 503, disk-error, etc.)
     ↓
diagnose: retryable error or fatal?
     ├── retryable → retry on same backend
     └── fatal → swap to fallback backend
              ↓
          rehydrate session state on new backend
              ↓
          user continues; some hitch is acceptable
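
The flow above can be sketched in a few dozen lines of TypeScript. This is a minimal illustration under stated assumptions, not the implementation of any system cited on this page: `Backend`, `RenderSession`, `FakeBackend`, `backendError`, and `classify` are hypothetical names, the backends are in-memory stand-ins rather than real WebGPU or WebGL contexts, and the choice to treat exhausted retries as fatal is an assumption of the sketch.

```typescript
// All names here (Backend, RenderSession, FakeBackend, ...) are hypothetical.
type Severity = "retryable" | "fatal";

interface Backend {
  readonly name: string;
  hydrate(state: string[]): void; // load session resources into this backend
  draw(): string;                 // one unit of work; may throw
}

// Attach a severity tag to an ordinary Error.
function backendError(message: string, severity: Severity): Error {
  return Object.assign(new Error(message), { severity });
}

// Assumption: untagged or unknown errors are treated as fatal.
function classify(err: unknown): Severity {
  if (typeof err === "object" && err !== null && "severity" in err) {
    const tag = (err as { severity: unknown }).severity;
    return tag === "retryable" ? "retryable" : "fatal";
  }
  return "fatal";
}

class RenderSession {
  readonly log: string[] = [];
  private active: Backend;
  private readonly fallback: Backend;
  private readonly state: string[];
  private readonly maxRetries: number;

  constructor(preferred: Backend, fallback: Backend, state: string[], maxRetries = 1) {
    this.active = preferred;
    this.fallback = fallback;
    this.state = state;
    this.maxRetries = maxRetries;
    this.active.hydrate(state); // initial session start
  }

  get backendName(): string {
    return this.active.name;
  }

  draw(): string {
    let retries = 0;
    while (true) {
      try {
        return this.active.draw();
      } catch (err) {
        if (classify(err) === "retryable" && retries < this.maxRetries) {
          retries++;
          this.log.push(`retry on ${this.active.name}`);
          continue; // retryable: try the same backend again
        }
        // Fatal (or, by assumption, retries exhausted): swap one way only.
        if (this.active === this.fallback) throw err; // nothing left to swap to
        this.active = this.fallback;
        this.active.hydrate(this.state); // same pathway as session start
        this.log.push(`swapped to ${this.active.name}`);
        retries = 0;
      }
    }
  }
}

// In-memory stand-in that fails with a chosen severity on chosen draw calls.
class FakeBackend implements Backend {
  readonly name: string;
  resources: string[] = [];
  private calls = 0;
  private readonly failures: Record<number, Severity>;

  constructor(name: string, failures: Record<number, Severity> = {}) {
    this.name = name;
    this.failures = failures;
  }

  hydrate(state: string[]): void {
    this.resources = state.slice();
  }

  draw(): string {
    this.calls++;
    const severity = this.failures[this.calls];
    if (severity) throw backendError(`${this.name} failed on call ${this.calls}`, severity);
    return `${this.name}:${this.resources.length}`;
  }
}

const webgpu = new FakeBackend("webgpu", { 2: "retryable", 4: "fatal" });
const webgl = new FakeBackend("webgl");
const session = new RenderSession(webgpu, webgl, ["texture-a", "buffer-b"]);
const rendered = [session.draw(), session.draw(), session.draw(), session.draw()];
// rendered    → ["webgpu:2", "webgpu:2", "webgl:2", "webgl:2"]
// session.log → ["retry on webgpu", "swapped to webgl"]
```

The caller only ever invokes `session.draw()`; the retry and the backend swap both happen behind that call, which is what keeps the session alive across the failure.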

Key design choices

  1. Shared abstraction layer — both backends must be behind the same interface (see concepts/graphics-api-abstraction-layer) so the session doesn't care which one is live. Without this, a fallback requires a full session teardown.
  2. State rehydration — whatever resources were loaded into the failed backend (textures, uploaded buffers, cached state) must be re-initialized on the fallback. This usually reuses the same pathway as the initial session start.
  3. Error classification — not every failure is fatal. Distinguish retryable errors (transient) from fatal errors (device lost, adapter revoked). Only fatal cases trigger fallback.
  4. Hitch tolerance — the user sees a brief pause during rehydration. That is acceptable compared to crashing, but unacceptable if it happens on every session. Rollout gating ties directly into patterns/device-blocklist-from-telemetry: if a device class has a high fallback rate, blocklist the preferred backend for that class so new sessions start on the fallback directly.
  5. One-directional — the swap usually goes only from the preferred backend to the fallback. Upgrading back from the fallback to the preferred backend mid-session is uncommon; the next session starts fresh and re-attempts the preferred path.
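
The rollout gating described in item 4 can be sketched as a small aggregation over session telemetry. This is an illustrative sketch only: `SessionRecord`, `buildBlocklist`, `chooseStartBackend`, the device-class labels, and both thresholds are hypothetical, and a real pipeline would pick its own bucketing and cut-offs.

```typescript
// Hypothetical telemetry record: one entry per finished session.
interface SessionRecord {
  deviceClass: string; // e.g. a GPU/driver/OS bucket (assumed labelling)
  fellBack: boolean;   // true if the session swapped to the fallback backend
}

// Returns device classes whose fallback rate exceeds maxFallbackRate,
// ignoring classes with fewer than minSessions observations.
function buildBlocklist(
  records: SessionRecord[],
  maxFallbackRate: number,
  minSessions: number,
): Set<string> {
  const totals = new Map<string, { sessions: number; fallbacks: number }>();
  for (const record of records) {
    const entry = totals.get(record.deviceClass) ?? { sessions: 0, fallbacks: 0 };
    entry.sessions += 1;
    if (record.fellBack) entry.fallbacks += 1;
    totals.set(record.deviceClass, entry);
  }
  const blocked = new Set<string>();
  totals.forEach((entry, deviceClass) => {
    const rate = entry.fallbacks / entry.sessions;
    if (entry.sessions >= minSessions && rate > maxFallbackRate) {
      blocked.add(deviceClass);
    }
  });
  return blocked;
}

// Static gate for new sessions: blocklisted classes start on the fallback.
function chooseStartBackend(deviceClass: string, blocked: Set<string>): "preferred" | "fallback" {
  return blocked.has(deviceClass) ? "fallback" : "preferred";
}

const records: SessionRecord[] = [
  { deviceClass: "class-a", fellBack: true },
  { deviceClass: "class-a", fellBack: true },
  { deviceClass: "class-a", fellBack: false },
  { deviceClass: "class-a", fellBack: false },
  { deviceClass: "class-b", fellBack: false },
  { deviceClass: "class-b", fellBack: false },
  { deviceClass: "class-b", fellBack: false },
  { deviceClass: "class-b", fellBack: false },
  { deviceClass: "class-c", fellBack: true }, // too few sessions to judge
];

const blocklist = buildBlocklist(records, 0.25, 4);
// chooseStartBackend("class-a", blocklist) → "fallback"
// chooseStartBackend("class-b", blocklist) → "preferred"
// chooseStartBackend("class-c", blocklist) → "preferred"
```

The minimum-session guard is a design choice of this sketch: it keeps a single unlucky session from blocklisting an entire device class.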

Why it's first-class rather than just error-handling

Before dynamic fallback, the common pattern was to detect incompatible backends at session start (static device checks, feature tests, UA strings). That works for the cases you know about at load time. It fails when a backend is lost mid-session: a WebGPU device can be revoked after hours of use, a database replica can fail mid-transaction, a provider endpoint can drop after the session was validated. The dynamic-fallback pattern treats mid-session failure as a first-class event with a known recovery path, not as an unrecoverable error.

Seen in

  • sources/2026-04-21-figma-rendering-powered-by-webgpu — Figma's WebGPU migration required dynamic fallback because mid-session WebGPU failures were observed on Windows after the static-probe blocklist was in place. The fallback reuses the existing WebGL-context-loss / WebGPU-device-loss handler shape but swaps backends instead of recreating the same one. A per-device fallback-rate blocklist closes the rollout loop by feeding back into static-gating decisions for future sessions.