Unified model catalog¶
Definition¶
A unified model catalog is the product-surface property of an AI platform where one catalog, one API, one credential model, one spend dashboard abstracts over many upstream providers, many modalities (text, image, video, speech, embeddings, rerankers), and many model families. Callers never enumerate which provider a model belongs to — they name the model and the catalog handles provider selection, authentication, fallback, spend attribution, and observability.
Previously ingested instances of the related three-pillar governance framing (centralised AI governance) are authorisation / audit / spend-oriented; this concept is the product-surface counterpart — the property that makes "use any model" an API call, not a procurement exercise.
Why a single catalog¶
The 2026-04-16 Cloudflare post's headline framing:
"70+ models across 12+ providers — all through one API, one line of code to switch between them, and one set of credits to pay for them."
The post cites AIDB Intel's pulse survey: "the average company today is calling 3.5 models across multiple providers". No single provider has a holistic view of that usage; the catalog is the only vantage point that does. The argument generalises: as soon as an application needs more than one model (a typical agent workflow: "classify with a fast small model, plan with a large reasoning model, execute with a lightweight model"), the cost of managing N provider relationships, N sets of secrets, N billing invoices, N rate-limit policies, and N retry-logic code paths compounds. A unified catalog collapses N to 1 along each of those axes.
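The N-to-1 collapse can be sketched as a single call shape spanning the whole agent workflow. Everything here is illustrative: `catalog.run` is a hypothetical stand-in for a unified binding like `env.AI.run`, and the model names other than `anthropic/claude-opus-4-6` are invented.

```typescript
// Illustrative sketch only: `catalog.run` stands in for a unified binding
// such as env.AI.run; the model string carries the provider selector.
type Catalog = { run(model: string, input: string): string };

// Mock catalog that "routes" on the provider prefix inside the model
// string, so the caller never holds per-provider clients or credentials.
const catalog: Catalog = {
  run: (model, input) => `[${model.split("/")[0]}] ${input}`,
};

// The typical agent workflow from the text: classify with a fast small
// model, plan with a large reasoning model, execute with a lightweight one.
const label = catalog.run("workers-ai/fast-classifier", "support ticket");
const plan = catalog.run("anthropic/claude-opus-4-6", label);
const output = catalog.run("workers-ai/light-executor", plan);
```

Three models, potentially three providers, but one call shape and one credential: the provider relationships collapse into the model string.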
Four collapse axes¶
- API surface. One call shape (`env.AI.run(model_string, ...)`) with the provider selector inside the model string (e.g. `'anthropic/claude-opus-4-6'`, `'@cf/moonshotai/kimi-k2.5'`). Canonical instance of patterns/unified-inference-binding.
- Credential. One authentication substrate (SSO / Zero Trust JWT / account credential) reaches any model; per-provider keys live in the gateway (concepts/byok-bring-your-own-key).
- Billing. One invoice, one credit pool (patterns/unified-billing-across-providers). Custom-metadata per request feeds per-user / per-tenant / per-workflow attribution.
- Observability. One dashboard for spend, latency, error rate, cache-hit ratio — across all providers and all modalities.
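The API-surface axis can be sketched with the routing rule implied by the examples above: a `@cf/` prefix marks a first-party Workers AI model, otherwise the first path segment names the upstream provider. `providerOf` is an invented helper for illustration, not gateway code.

```typescript
// Hedged sketch: derive the provider from the model string alone, so the
// caller never enumerates providers. The "@cf/" rule and the function name
// are assumptions inferred from the catalog's example model strings.
function providerOf(model: string): string {
  if (model.startsWith("@cf/")) return "workers-ai"; // first-party model
  const slash = model.indexOf("/");
  return slash === -1 ? "unknown" : model.slice(0, slash);
}

providerOf("anthropic/claude-opus-4-6"); // → "anthropic"
providerOf("@cf/moonshotai/kimi-k2.5"); // → "workers-ai"
```

This is why the catalog, not the caller, owns provider selection, fallback, and spend attribution: the provider is metadata inside the model name.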
Catalog scope widens from text LLM to multimodal¶
A narrow "one LLM proxy, many LLM providers" catalog is the starting point; the 2026-04-16 post widens scope to image, video, and speech models in the same catalog with the same API shape: "We're excited to be expanding access to models from Alibaba Cloud, AssemblyAI, Bytedance, Google, InWorld, MiniMax, OpenAI, Pixverse, Recraft, Runway, and Vidu ... we're expanding our model offerings to include image, video, and speech models so that you can build multimodal applications." The AIDB-Intel 3.5-models-per-company figure becomes a floor, not a ceiling, as multimodal agents add tool calls against dedicated speech-to-text, text-to-image, etc.
The catalog property now requires:
- Schema unification across modalities — one call shape has to accept text prompts, image inputs, video inputs, and audio inputs without forking into per-modality APIs.
- Tokenisation / cost models across modalities — text LLMs are token-billed; image/video/audio models are unit-billed (or second-billed for video). The catalog's pricing dashboard has to represent all units.
- Failover semantics across modalities — patterns/automatic-provider-failover is well-defined for same-capability models; for image / video / speech where each provider may have materially different quality characteristics, the catalog has to decide what "equivalent" means per modality.
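The cross-modality pricing point can be sketched as one normalised cost function over heterogeneous billing units. The `Usage` shape and every rate below are invented placeholders, not any provider's actual pricing.

```typescript
// Hedged sketch of the cross-modality billing problem: text is token-
// billed, images unit-billed, video/speech second-billed, yet the catalog
// dashboard needs a single normalised cost figure per request.
type Usage =
  | { kind: "text"; tokens: number }
  | { kind: "image"; images: number }
  | { kind: "video" | "speech"; seconds: number };

// Placeholder per-unit USD rates; a real catalog would look these up
// per model and per provider.
const RATE = { text: 0.000002, image: 0.04, video: 0.05, speech: 0.0001 };

function costUsd(u: Usage): number {
  switch (u.kind) {
    case "text":
      return u.tokens * RATE.text;
    case "image":
      return u.images * RATE.image;
    default:
      return u.seconds * RATE[u.kind];
  }
}
```

Once every request reduces to one currency figure, per-user / per-tenant / per-workflow attribution works the same way regardless of modality.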
Ingested instances¶
- Cloudflare AI Gateway + Workers AI (2026-04-16 AI Platform post): 70+ models across 12+ providers (Anthropic, OpenAI, Google, Alibaba Cloud, AssemblyAI, Bytedance, InWorld, MiniMax, Pixverse, Recraft, Runway, Vidu, plus Workers AI `@cf/…`). Callable via `env.AI.run(...)` today, REST API in coming weeks. BYO-model via Cog containerisation (Enterprise + design partners).
- Databricks Unity AI Gateway (2026-04-17 post): Foundation Model API for first-party inference (OpenAI / Anthropic / Gemini / Qwen) + BYO external capacity. Catalog scoped to LLMs + coding-agent clients.
Both instances invoke the same three-pillar framing (concepts/centralized-ai-governance); this concept is the first pillar (unified API surface), extended in scope beyond LLMs.
Related¶
- patterns/unified-inference-binding — the API-surface collapse axis.
- patterns/ai-gateway-provider-abstraction — the gateway-level mechanism that makes the catalog possible.
- patterns/unified-billing-across-providers — the billing collapse axis.
- patterns/central-proxy-choke-point — the authorisation collapse axis.
- patterns/automatic-provider-failover — the resilience primitive the catalog enables ("models available on multiple providers" failover automatically).
- concepts/centralized-ai-governance — the governance-side sibling concept.
- concepts/byok-bring-your-own-key — the credential-collapse prerequisite.
- concepts/coding-agent-sprawl — the original motivating problem on the Databricks ingest.
- systems/cloudflare-ai-gateway — canonical customer-facing instance.
- systems/workers-ai — first-party-plus-catalog instance.
- systems/unity-ai-gateway — adjacent vendor instance specialised to coding tools.