Skip to content

PATTERN Cited by 4 sources

AI Gateway provider abstraction

AI Gateway provider abstraction is the pattern of routing all application LLM calls through a single proxy endpoint that owns provider / model selection, secret injection, retry / fallback policy, rate limiting, logging, and cost accounting — so that the application only knows "I point at the gateway" and everything that could change about provider choice is reconfigured at the gateway, not in application code or deploy pipelines.

Mechanics

  • The application is configured with a single base URL (ANTHROPIC_BASE_URL / OPENAI_API_BASE / similar) pointing at the gateway, using the provider's native API shape for the call.
  • The gateway authenticates the caller via a separate substrate (API-gateway key, SSO JWT, workload identity).
  • On each request the gateway:
  • Resolves which upstream provider/model to use (static config, per-tenant config, or a fallback chain).
  • Injects the real upstream API key server-side (concepts/byok-bring-your-own-key).
  • Emits logs / metrics / audit records.
  • Forwards the request and streams the response back.

Why this pattern

  • Zero-code model swaps. New model releases, price shifts, or provider availability incidents become gateway-config changes. The application never redeploys.
  • Centralised observability. One view of LLM spend, latency, and error rates across heterogeneous providers — the app doesn't need to stitch provider-specific telemetry.
  • Rotation and revocation centralised. Keys live in one secrets store; rotation doesn't touch the application.
  • Uniform rate-limiting and policy enforcement. Per-tenant quotas, per-user quotas, token budgets all enforceable in one place.

Contrast

Related to patterns/middleware-worker-adapter (same concern-ownership philosophy) and patterns/protocol-compatible-drop-in-proxy (the AI-Gateway variant is usually protocol-compatible: the gateway speaks the upstream provider's API shape), but specialised to the LLM provider category where the combinatorics of {provider × model × key × quota × fallback} specifically motivate centralisation.

Seen in

Last updated · 200 distilled / 1,178 read