Remote config model routing¶
Intent¶
Decouple the model-routing decision from the code running the AI workload so operators can flip a switch at the control plane and have every running agent re-route within seconds — no code deploy, no on-call page, no coordinated downtime.
When to reach for it¶
- Running AI workloads that span many concurrent jobs (CI pipelines, long-running agents, fleets of worker processes).
- Model providers go down unpredictably; outages happen at 8 a.m. UTC while on-call is asleep.
- A new frontier model ships every few weeks; flipping to the next generation should not require a re-deploy.
- You want to A/B or canary model assignments per reviewer / per role / per tenant without per-call overhead.
Mechanism¶
- Config lives in KV (or an equivalent low-latency read-heavy store).
- Config is read through a Cloudflare Worker (or equivalent edge function) that applies a filter-then-select pipeline: enforce per-provider enabled flags, drop models from disabled providers, pick a primary + failback chain per role.
- Every AI workload consults the config at startup and/or on a short TTL (5 seconds in Cloudflare's instance).
- Flipping a value in KV re-routes every running job in ≤ the TTL.
- Config also carries failback chain overrides, so the full routing topology is reshapable from one Worker update.
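The workload side of this mechanism reduces to a TTL cache in front of the config Worker. A minimal sketch, assuming the config endpoint returns JSON; the names `RoutingConfigCache` and `fetchConfig` are illustrative, not from the source:

```javascript
// TTL-cached config reader for an AI workload. fetchConfig is any
// async function that pulls the current routing config (e.g. a
// GET /config call to the config Worker).
class RoutingConfigCache {
  constructor(fetchConfig, ttlMs = 5000) {
    this.fetchConfig = fetchConfig;
    this.ttlMs = ttlMs; // 5 s in Cloudflare's instance
    this.cached = null;
    this.fetchedAt = 0;
  }

  // Returns the cached config until the TTL expires, then refetches.
  // `now` is injectable for testing; defaults to wall-clock time.
  async get(now = Date.now()) {
    if (!this.cached || now - this.fetchedAt >= this.ttlMs) {
      this.cached = await this.fetchConfig();
      this.fetchedAt = now;
    }
    return this.cached;
  }
}
```

Because every workload re-reads within the TTL, a single KV flip propagates fleet-wide in at most one TTL window without any push infrastructure.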
Shape¶
┌──────────────┐ ┌───────────────┐
│ KV store │ ◄─── operator ───│ admin UI / │
│ providers │ edit │ CLI / API │
│ reviewer- │ └───────────────┘
│ models │
│ failback │
│ chains │
└──────┬───────┘
│
▼
┌──────────────┐
│ Config │ ← GET /config per reviewer
│ Worker │ filters by enabled providers,
│ │ returns primary + fallbacks
└──────┬───────┘
│
▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ CI job #1 │ │ CI job #2 │ ... │ CI job #N │
│ (AI review) │ │ (AI review) │ │ (AI review) │
└──────────────┘ └──────────────┘ └──────────────┘
Cloudflare's AI Code Review instance¶
- Per-reviewer model assignments — one block per specialist (security, performance, code quality, documentation, release, AGENTS.md, engineering codex) naming primary model + fallback chain.
- Providers block — { anthropic: { enabled: true }, openai: { enabled: true }, cloudflare: { enabled: true } }
- Filter-then-select logic:
function filterModelsByProviders(models, providers) {
return models.filter((m) => {
const provider = extractProviderFromModel(m.model);
if (!provider) return true; // Unknown provider → keep
const config = providers[provider];
if (!config) return true; // Not in config → keep
return config.enabled; // Disabled → filter out
});
}
- Effect: "We can flip a switch in KV to disable an entire provider, and every running CI job will route around it within five seconds."
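The select half of the pipeline is not shown in the source; a runnable sketch of one plausible shape follows. `extractProviderFromModel`, the "provider/model" id format, and `selectModelsForRole` are assumptions for illustration (the filter function is repeated here so the sketch is self-contained):

```javascript
// Assumed id format "provider/model", e.g. "anthropic/claude-sonnet".
function extractProviderFromModel(modelId) {
  const i = modelId.indexOf("/");
  return i === -1 ? null : modelId.slice(0, i);
}

// Repeated from the source snippet above so this sketch runs standalone.
function filterModelsByProviders(models, providers) {
  return models.filter((m) => {
    const provider = extractProviderFromModel(m.model);
    if (!provider) return true;        // Unknown provider → keep
    const config = providers[provider];
    if (!config) return true;          // Not in config → keep
    return config.enabled;             // Disabled → filter out
  });
}

// Select step: order the role's primary + fallback chain, drop models
// from disabled providers, and re-split into primary + fallbacks.
function selectModelsForRole(roleConfig, providers) {
  const candidates = [roleConfig.primary, ...roleConfig.fallbacks]
    .map((model) => ({ model }));
  const alive = filterModelsByProviders(candidates, providers);
  return {
    primary: alive.length > 0 ? alive[0].model : null,
    fallbacks: alive.slice(1).map((m) => m.model),
  };
}
```

Disabling a provider in KV thus promotes the first surviving fallback to primary on the next config read, with no change to the role's stored chain.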
Why this pattern scales¶
- Constant-effort recovery on provider outages. One KV flip re-routes globally, no matter how many jobs are running.
- Per-role experimentation. Route Documentation to Kimi K2.5 today; try Sonnet tomorrow; revert in seconds.
- No re-deploy. CI job code is invariant under routing change.
- Tenant isolation. Per-tenant config is a natural extension of the same shape.
Tradeoffs¶
- Control-plane availability is load-bearing. If the config Worker is down, every AI workload falls back to its embedded defaults or errors. Defaults must be safe.
- Config drift. Workloads at different stages of their TTL see different configs during a flip window. Usually tolerable at 5-second TTLs.
- Debuggability. "Which model routed my job?" requires correlating workload logs with config snapshots. Pair with telemetry that tags each request with its resolved model.
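The first tradeoff implies the workload must ship with safe embedded defaults. A minimal sketch of that posture, assuming the config Worker can be unreachable; `EMBEDDED_DEFAULTS` and the model id are hypothetical placeholders:

```javascript
// Defaults baked into the workload binary. These must be safe to use
// unconditionally, because they are what runs when the control plane
// is down. The model id here is a placeholder, not a recommendation.
const EMBEDDED_DEFAULTS = {
  primary: "anthropic/claude-sonnet",
  fallbacks: [],
};

// Prefer live config; degrade to embedded defaults on any fetch error
// rather than failing the job on a control-plane outage.
async function resolveConfig(fetchConfig) {
  try {
    return await fetchConfig();
  } catch (err) {
    return EMBEDDED_DEFAULTS;
  }
}
```

Pairing this with per-request telemetry that records the resolved model (live vs. default) also answers the debuggability question of which path routed a given job.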
Sibling patterns¶
- vs. patterns/event-driven-config-refresh — event-driven refresh pushes changes to clients; remote-config-model-routing is the pull variant (clients poll the config Worker on demand). Same outcome, different network shape.
- vs. patterns/automatic-provider-failover — failover handles the inflight provider-outage case; remote-config-model-routing handles the organisational / policy / canary case.
- vs. patterns/central-proxy-choke-point — choke-point is where all traffic flows; remote-config-model-routing is how the choke point decides where to forward each call.
Seen in¶
- sources/2026-04-20-cloudflare-orchestrating-ai-code-review-at-scale — canonical production instance on CI-integrated AI code review; 5-second re-route on KV flip; providers + failback-chain overrides controllable from the same Worker.
Related¶
- patterns/automatic-provider-failover — sibling inflight-resilience pattern.
- patterns/central-proxy-choke-point — complementary posture.
- systems/cloudflare-kv — storage substrate.
- systems/cloudflare-workers — runtime substrate.
- systems/cloudflare-ai-gateway — consumer.
- systems/cloudflare-ai-code-review — canonical instance.