
PATTERN Cited by 1 source

Remote config model routing

Intent

Decouple the model-routing decision from the code running the AI workload so operators can flip a switch at the control plane and have every running agent re-route within seconds — no code deploy, no on-call page, no coordinated downtime.

When to reach for it

  • You run AI workloads that span many concurrent jobs (CI pipelines, long-running agents, fleets of worker processes).
  • Model providers go down unpredictably, and outages can land when on-call is asleep (8 a.m. UTC, for example).
  • A new frontier model ships every few weeks; flipping to the next generation should not require a re-deploy.
  • You want to A/B or canary model assignments per reviewer / per role / per tenant without per-call overhead.

Mechanism

  1. Config lives in KV (or an equivalent low-latency read-heavy store).
  2. Config is read through a Cloudflare Worker (or equivalent edge function) that applies a filter-then-select pipeline: enforce per-provider enabled flags, drop models from disabled providers, then pick the primary + failback chain per role.
  3. Every AI workload consults the config at startup and/or at a short TTL (5 seconds in Cloudflare's instance).
  4. Flipping a value in KV re-routes every running job within one TTL.
  5. Config also carries failback chain overrides, so the full routing topology is reshapable from one Worker update.
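
Steps 3 and 4 can be sketched as a small TTL cache on the workload side. This is a minimal illustration under stated assumptions, not Cloudflare's implementation: `createConfigCache`, `loadConfig`, and the injectable `now` clock are hypothetical names, and in practice `loadConfig` would be an HTTP call to the config Worker.

```javascript
// Hypothetical helper: wraps any async config loader in a TTL cache so a
// long-running job re-reads the routing config at most once per ttlMs.
function createConfigCache(loadConfig, { ttlMs = 5000, now = Date.now } = {}) {
  let cached = null;          // last config returned by loadConfig
  let fetchedAt = -Infinity;  // timestamp (ms) of the last successful load

  return async function getConfig() {
    const age = now() - fetchedAt;
    if (cached !== null && age < ttlMs) {
      return cached;          // still fresh: no round trip to the control plane
    }
    // Stale or never loaded: re-read. Errors propagate to the caller,
    // which decides whether to use embedded defaults or fail.
    cached = await loadConfig();
    fetchedAt = now();
    return cached;
  };
}
```

A job would call `getConfig()` before each model request; with the default 5000 ms TTL, an operator's change becomes visible to that job at most one TTL after it lands. Concurrent callers may trigger duplicate loads in this sketch; de-duplicating in-flight loads is left out for brevity.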

Shape

  ┌──────────────┐                  ┌───────────────┐
  │ KV store     │ ◄─── operator ───│ admin UI /    │
  │  providers   │     edit          │ CLI / API     │
  │  reviewer-   │                  └───────────────┘
  │   models     │
  │  failback    │
  │   chains     │
  └──────┬───────┘
         │
         ▼
  ┌──────────────┐
  │ Config       │  ←  GET /config per reviewer
  │ Worker       │     filters by enabled providers,
  │              │     returns primary + fallbacks
  └──────┬───────┘
         │
         ├──────────────────────┬──────────────────────┐
         ▼                      ▼                      ▼
  ┌──────────────┐       ┌──────────────┐       ┌──────────────┐
  │ CI job #1    │       │ CI job #2    │  ...  │ CI job #N    │
  │ (AI review)  │       │ (AI review)  │       │ (AI review)  │
  └──────────────┘       └──────────────┘       └──────────────┘

Cloudflare's AI Code Review instance

  • Per-reviewer model assignments — one block per specialist (security, performance, code quality, documentation, release, AGENTS.md, engineering codex) naming primary model + fallback chain.
  • Providers block: { anthropic: { enabled: true }, openai: { enabled: true }, cloudflare: { enabled: true } }.
  • Filter-then-select logic:
function filterModelsByProviders(models, providers) {
  return models.filter((m) => {
    const provider = extractProviderFromModel(m.model);
    if (!provider) return true;      // Unknown provider → keep
    const config = providers[provider];
    if (!config) return true;        // Not in config → keep
    return config.enabled;           // Disabled → filter out
  });
}
  • Effect: "We can flip a switch in KV to disable an entire provider, and every running CI job will route around it within five seconds."
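
To show how the filter feeds the select step, here is a hedged, self-contained sketch. `filterModelsByProviders` is reproduced from the snippet above; everything else is an assumption made for illustration: the `"<provider>/<model-name>"` id format behind `extractProviderFromModel`, the `resolveRoute` helper, the config shape, and the placeholder model ids.

```javascript
// Assumption: model ids look like "<provider>/<model-name>".
function extractProviderFromModel(modelId) {
  const slash = modelId.indexOf("/");
  return slash > 0 ? modelId.slice(0, slash) : null;
}

// Same logic as the snippet above.
function filterModelsByProviders(models, providers) {
  return models.filter((m) => {
    const provider = extractProviderFromModel(m.model);
    if (!provider) return true;      // Unknown provider → keep
    const config = providers[provider];
    if (!config) return true;        // Not in config → keep
    return config.enabled;           // Disabled → filter out
  });
}

// Hypothetical select step: the first surviving model becomes the primary,
// the remaining ones form the failback chain, in their configured order.
function resolveRoute(config, reviewer) {
  const candidates = config.reviewers[reviewer] ?? [];
  const usable = filterModelsByProviders(candidates, config.providers);
  if (usable.length === 0) return null; // nothing routable for this reviewer
  const [primary, ...fallbacks] = usable;
  return { primary: primary.model, fallbacks: fallbacks.map((m) => m.model) };
}

// Placeholder config: "anthropic" is disabled, "cloudflare" is not listed.
const exampleConfig = {
  providers: { anthropic: { enabled: false }, openai: { enabled: true } },
  reviewers: {
    security: [
      { model: "anthropic/model-a" },
      { model: "openai/model-b" },
      { model: "cloudflare/model-c" },
    ],
  },
};

const route = resolveRoute(exampleConfig, "security");
console.log(route);
```

With `anthropic` disabled, the first entry is dropped, `openai/model-b` is promoted to primary, and `cloudflare/model-c` stays in the chain because an unlisted provider is kept rather than filtered.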

Why this pattern scales

  • Constant-effort recovery from provider outages. One KV flip re-routes the whole fleet, however many jobs are running.
  • Per-role experimentation. Route Documentation to Kimi K2.5 today; try Sonnet tomorrow; revert in seconds.
  • No re-deploy. CI job code stays the same when routing changes.
  • Tenant isolation. Per-tenant config is a natural extension of the same shape.
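
One way the per-tenant extension could look, as a sketch only: a base `reviewers` map plus an optional `tenants` map whose entries override individual reviewers. The `tenants` key, the `reviewersForTenant` helper, and the placeholder ids are all hypothetical; the source does not describe a tenant schema.

```javascript
// Hypothetical overlay: tenant-specific reviewer assignments win, and any
// reviewer the tenant does not mention keeps the base assignment.
function reviewersForTenant(config, tenantId) {
  const overrides = config.tenants?.[tenantId]?.reviewers ?? {};
  return { ...config.reviewers, ...overrides };
}

const tenantConfig = {
  reviewers: {
    documentation: [{ model: "provider-a/model-1" }],
    security: [{ model: "provider-a/model-2" }],
  },
  tenants: {
    "tenant-42": {
      reviewers: { documentation: [{ model: "provider-b/model-3" }] },
    },
  },
};

const forTenant = reviewersForTenant(tenantConfig, "tenant-42");
const forOthers = reviewersForTenant(tenantConfig, "tenant-7");
console.log(forTenant.documentation[0].model, forOthers.documentation[0].model);
```

Because the overlay is resolved from the same config document, a per-tenant canary can be reverted by the same kind of edit that reverts a global change.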

Tradeoffs

  • Control-plane availability is load-bearing. If the config Worker is down, every AI workload falls back to its embedded defaults or errors. Defaults must be safe.
  • Config drift. Workloads at different stages of their TTL see different configs during a flip window. Usually tolerable at 5-second TTLs.
  • Debuggability. "Which model routed my job?" requires correlating workload logs with config snapshots. Pair with telemetry that tags each request with its resolved model.
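
The first and third tradeoffs can be mitigated on the workload side. The sketch below assumes the remote route carries a `version` field and that the caller passes `null` when the config Worker could not be reached; `EMBEDDED_DEFAULTS`, `withSafeDefaults`, `tagRequest`, and the placeholder model id are hypothetical names for illustration.

```javascript
// Hypothetical embedded defaults, shipped with the job code so that a
// control-plane outage degrades to a known-safe route instead of an error.
const EMBEDDED_DEFAULTS = Object.freeze({
  version: "embedded",
  primary: "default-provider/default-model", // placeholder id
  fallbacks: [],
});

// Accepts the route fetched from the config Worker, or null/undefined when
// the fetch failed, and always returns something routable.
function withSafeDefaults(remoteRoute) {
  const valid =
    Boolean(remoteRoute) &&
    typeof remoteRoute.primary === "string" &&
    remoteRoute.primary.length > 0;
  return valid
    ? { ...remoteRoute, source: "remote" }
    : { ...EMBEDDED_DEFAULTS, source: "embedded-default" };
}

// Builds the telemetry tags for one request so that "which model routed my
// job?" can be answered from workload logs alone.
function tagRequest(jobId, route) {
  return {
    jobId,
    model: route.primary,
    configVersion: route.version,
    configSource: route.source,
    resolvedAt: new Date().toISOString(),
  };
}
```

Logging the output of `tagRequest` alongside each model call records the resolved model and the config version it came from, which removes the need to reconstruct config snapshots after the fact.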

Sibling patterns

  • vs. patterns/event-driven-config-refresh: event-driven refresh pushes changes to clients; remote-config-model-routing is the pull variant (clients poll the config Worker at startup and/or on a short TTL). Same outcome, different network shape.
  • vs. patterns/automatic-provider-failover: failover reacts automatically to the in-flight provider-outage case; remote-config-model-routing handles the operator-driven organisational / policy / canary case.
  • vs. patterns/central-proxy-choke-point: the choke point is where all traffic flows; remote-config-model-routing is how the choke point decides where to forward each call.
