Unity AI Gateway (Databricks)¶
Unity AI Gateway is Databricks' productised instance of the AI-gateway provider abstraction pattern, specialised to coding agents + MCP integrations rather than just application LLM calls. Its job is to be the single governance + cost + telemetry plane for all coding-tool traffic in a Databricks customer's fleet.
Generalisation to org-wide agent populations (2026-05-20 disclosure)¶
The 2026-05-20 Governing AI agents at scale with Unity Catalog post (sources/2026-05-20-databricks-governing-ai-agents-at-scale-with-unity-catalog) generalises Unity AI Gateway from coding-agent scope to org-wide agent scope and adds three named architectural extensions that this page didn't previously canonicalise.
Scope generalisation¶
The 2026-04-17 launch positioned the Gateway around coding-agent sprawl (Cursor / Codex / Claude Code / Gemini CLI). The 2026-05-20 post generalises to every department's agents: dev (coding agents), analytics (forecasting agents), sales-ops (lead-scoring agents), support (ticket-routing agents), marketing (personalization), finance (reconciliation). The architectural surface didn't change — the "every model call, every tool invocation, every agent interaction flows through the gateway" principle now covers all of them.
Three new feature surfaces¶
The Gateway as disclosed on 2026-04-17 had centralised audit + cost + observability. The 2026-05-20 post discloses the following additional named layers attached to the same proxy:
| Layer | What it does | Wiki entity |
|---|---|---|
| Service Policies | Pre-execution per-tool-call evaluation; UC functions attached to registered MCPs; returns allow/deny/consent; fail-closed on deny | systems/uc-service-policies |
| Guardrails | Inline content scanning of every model call — inputs (PII, jailbreak), outputs (hallucinations, sensitive content); fail-closed | systems/unity-ai-gateway-guardrails |
| Inference Tables | Full payload of every model call (exact prompt + exact response + tokens + latency) written to UC-managed Delta tables; customer-controlled retention | systems/inference-tables |
| Budgets | Per-user / per-group monthly spend thresholds with alerts; hard enforcement on roadmap | systems/unity-ai-gateway-budgets |
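The Service Policies row describes a pre-execution check that returns allow, deny, or consent for each tool call. The post does not disclose how policies are written or combined, so the following is only a minimal sketch under stated assumptions: the `ToolCall` shape, the `Policy` signature, the "strictest outcome wins" rule, and the treatment of a crashing policy as a deny are all hypothetical and are not the Databricks API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    CONSENT = "consent"  # pause and ask the user before the tool runs


@dataclass(frozen=True)
class ToolCall:
    # Hypothetical shape of a pending MCP tool invocation.
    user: str
    mcp_server: str
    tool: str
    arguments: dict


# Hypothetical policy signature: inspect the pending call, return a Decision.
Policy = Callable[[ToolCall], Decision]


def evaluate(call: ToolCall, policies: list[Policy]) -> Decision:
    """Run every policy before execution; the strictest outcome wins."""
    result = Decision.ALLOW
    for policy in policies:
        try:
            decision = policy(call)
        except Exception:
            # Assumption: a policy that errors is treated as a deny (fail-closed).
            return Decision.DENY
        if not isinstance(decision, Decision) or decision is Decision.DENY:
            return Decision.DENY
        if decision is Decision.CONSENT:
            result = Decision.CONSENT
    return result


def no_deletes(call: ToolCall) -> Decision:
    # Example policy: block any tool whose name starts with "delete_".
    return Decision.DENY if call.tool.startswith("delete_") else Decision.ALLOW


def confirm_external_email(call: ToolCall) -> Decision:
    # Example policy: require user consent for mail leaving example.com.
    recipient = call.arguments.get("to", "")
    if call.tool == "send_email" and not recipient.endswith("@example.com"):
        return Decision.CONSENT
    return Decision.ALLOW
```

With both example policies attached, a `delete_account` call is denied, a `send_email` call to an outside address comes back as consent, and everything else is allowed. The tool only runs when the final decision is `Decision.ALLOW`.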
Four-pillar repositioning¶
The 2026-05-20 post repositions the Gateway in the four-pillar framing (concepts/four-pillars-of-agent-governance):
- Pillar 1 (Delegated access) — three-layer composition: OBO permissions + Service Policies + Guardrails. The Gateway is the enforcement fabric where all three layers run.
- Pillar 2 (Data-centric AI governance) — Gateway writes Inference Tables + UC audit logs to the lakehouse, joinable with business data; substrate for Lakewatch (agentic SIEM).
- Pillar 3 (Cost intelligence) — usage-tracking + Budgets.
- Pillar 4 (Open and interoperable) — single governed endpoint across Databricks-hosted models + Azure OpenAI + AWS Bedrock + Anthropic; framework-agnostic across LangGraph / CrewAI / OpenAI SDK / Anthropic SDK / AutoGen / LlamaIndex.
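The Guardrails layer named under Pillar 1 scans model inputs and outputs inline and fails closed. The scanners themselves are not disclosed, so this sketch substitutes two deliberately naive, hypothetical checks (an email regex and a fixed jailbreak phrase) purely to show the control flow: scan the prompt, call the model only if the prompt passes, then scan the response before returning it.

```python
import re
from typing import Callable, Optional

# Hypothetical scanner signature: return a reason string to block, or None to pass.
Scanner = Callable[[str], Optional[str]]

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class Blocked(Exception):
    """Raised when a guardrail blocks a prompt or a response."""


def pii_scanner(text: str) -> Optional[str]:
    # Naive stand-in for PII detection: flags anything that looks like an email.
    return "possible email address" if EMAIL.search(text) else None


def jailbreak_scanner(text: str) -> Optional[str]:
    # Naive stand-in for jailbreak detection: one fixed phrase.
    return "jailbreak phrasing" if "ignore previous instructions" in text.lower() else None


def scan(text: str, scanners: list[Scanner]) -> None:
    for scanner in scanners:
        try:
            reason = scanner(text)
        except Exception as exc:
            # Fail-closed: a scanner that cannot run blocks the call.
            raise Blocked(f"scanner failed: {exc}") from exc
        if reason is not None:
            raise Blocked(reason)


def guarded_call(
    prompt: str,
    model: Callable[[str], str],
    input_scanners: list[Scanner],
    output_scanners: list[Scanner],
) -> str:
    scan(prompt, input_scanners)      # the model is never called if this raises
    response = model(prompt)
    scan(response, output_scanners)   # the response is never returned if this raises
    return response
```

Because `scan` raises before `model` is invoked, a blocked prompt never reaches the provider. That ordering is the property an inline, fail-closed guardrail is meant to give.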
Identity propagation (now explicit)¶
The 2026-05-20 post is the first to explicitly disclose OBO as the data-access mechanism: "identity flows end to end, from the user who asks the question to the specific table row the agent retrieves." The Gateway is the identity-translation point — agents inherit the invoking user's UC permissions in real time via on-behalf-of token passing, not via shared service accounts.
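The difference between on-behalf-of access and a shared service account can be illustrated with a small sketch. Everything here is hypothetical (the `GRANTS` table, the `OboToken` fields, the principal names); the post does not describe the token format or the Unity Catalog check. The sketch only shows the idea that authorisation is decided by the invoking user's grants, not by the agent's own.

```python
from dataclasses import dataclass

# Hypothetical grant table: principal -> tables it may read.
GRANTS = {
    "alice@example.com": {"sales.leads", "sales.accounts"},
    "bob@example.com": {"sales.leads"},
    # A broad service account, shown only for contrast with OBO.
    "svc-agent": {"sales.leads", "sales.accounts", "finance.ledger"},
}


@dataclass(frozen=True)
class OboToken:
    subject: str  # the invoking user whose permissions apply
    actor: str    # the agent acting on the user's behalf (recorded, not authorised)


def can_read(token: OboToken, table: str) -> bool:
    """Authorise against the user's grants; the agent's own grants are ignored."""
    return table in GRANTS.get(token.subject, set())


def can_read_shared_account(service_account: str, table: str) -> bool:
    """Contrast case: every caller gets whatever the service account can see."""
    return table in GRANTS.get(service_account, set())
```

Under this sketch the same agent sees `sales.accounts` when Alice asks and is refused when Bob asks, while the shared-account variant would return `finance.ledger` to both.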
Three-pillar architecture (from the 2026-04-17 launch post)¶
- Centralised security and audit.
  - Every agent data-access flow logged in Unity Catalog (same governance substrate as Lakehouse data + ML assets).
  - All tracing in MLflow (specifically MLflow 3 GenAI tracing — named for Claude Code integration).
  - MCP servers "managed in Databricks" — the gateway is the policy point for MCP traffic, not just LLM traffic.
  - Single-identity plane: developers authenticate once with Databricks credentials for all tools (GitHub, Atlassian, etc.), "no separate logins per service".
- Single bill and cost limits.
  - Foundation Model API provides first-party inference for OpenAI, Anthropic, Gemini, and open models like Qwen.
  - Admins can also "bring external capacity in", extending governance "to all your tokens, regardless of where they flow" — patterns/unified-billing-across-providers.
  - Gateway-enforced budgets are per-developer, not per-tool — admins give each developer one budget and the developer burns it on whichever tool of choice (Cursor / Codex / Gemini CLI / Claude Code / …).
- Full observability in the Lakehouse.
  - Coding-tool metrics + traces land in Unity-Catalog-managed Delta tables via OpenTelemetry ingestion.
  - Joinable with other Lakehouse datasets (Workday for adoption-by-org / region / seniority; PR-cycle data for velocity quantification) — patterns/telemetry-to-lakehouse.
  - Surfaces rate-limit hits as a proactive capacity-planning signal.
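The per-developer (rather than per-tool) budget in the second pillar can be sketched as a single counter per developer that every tool draws from. The class name, the 80% alert fraction, and the status strings below are illustrative assumptions and not Gateway configuration. The sketch reports an over-budget status without blocking, since the later 2026-05-20 post describes thresholds with alerts and places hard enforcement on the roadmap.

```python
from collections import defaultdict


class DeveloperBudgets:
    """Hypothetical tracker: one monthly budget per developer, shared by all tools."""

    def __init__(self, limits_usd: dict[str, float], alert_fraction: float = 0.8):
        self.limits_usd = limits_usd
        self.alert_fraction = alert_fraction        # assumed alert threshold
        self.spent_usd = defaultdict(float)         # keyed by developer only
        self.spent_by_tool = defaultdict(float)     # (developer, tool), for reporting

    def record(self, developer: str, tool: str, cost_usd: float) -> str:
        """Add spend and return 'ok', 'alert', or 'over_budget' for the developer."""
        limit = self.limits_usd[developer]  # KeyError if no budget was assigned
        self.spent_usd[developer] += cost_usd
        self.spent_by_tool[(developer, tool)] += cost_usd
        spent = self.spent_usd[developer]
        if spent >= limit:
            return "over_budget"
        if spent >= self.alert_fraction * limit:
            return "alert"
        return "ok"
```

A developer with a 100 USD budget who spends 50 on one tool and 35 on another crosses the alert threshold on the second call, even though neither tool alone reached it. That is the practical consequence of budgeting per developer instead of per tool.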
Supported clients (at launch)¶
- Cursor
- Codex CLI
- Gemini CLI
- Claude Code — integration referenced via MLflow 3 tracing docs.
Relation to existing wiki entities¶
- Same shape as Cloudflare's internal AI engineering stack (sources/2026-04-20-cloudflare-internal-ai-engineering-stack) — single proxy, BYOK, central telemetry, single-identity plane. Databricks specialises for coding-agent clients + MCP governance where the Cloudflare instance specialised for internal application workloads.
- Parent pattern: patterns/ai-gateway-provider-abstraction.
- Audit substrate: systems/unity-catalog.
- Tracing substrate: systems/mlflow.
- Governed tool surface: systems/model-context-protocol.
What the post does not disclose¶
- Gateway internals: routing, fallback, rate-limiter algorithm, streaming handling, per-provider adapter shape.
- MCP-governance mechanics: how the gateway inspects MCP traffic, auth flow from coding-tool → gateway → MCP → data source.
- Telemetry schema landing in Delta tables.
- Latency / throughput / cost-per-token / adoption numbers.
Tier-3 Databricks post — ingested because the problem framing (coding-agent sprawl) and three-pillar architecture are substantive, not because the internals are disclosed.
Seen in¶
- sources/2026-05-22-databricks-how-world-bank-group-uses-databricks-to-eradicate-poverty-through-shared-knowledge — AI Gateway as control plane for an agentic-router multi-Genie deployment. World Bank Group's Knowledge 360 / Data 360 platform names the Databricks AI Gateway as "centralized control over agent access, cost management and security as the system grew more complex." This is the first wiki source where the AI Gateway gates an intent-domain-decomposer agentic-router fronting multiple per-domain Genie instances + a RAG agent + a visualisation agent rather than a coding-agent fleet — confirms the 2026-05-20 Governing AI agents at scale post's scope-generalisation thesis (every department's agents, not just coding agents). Caveat: name-drop altitude only — no policy mechanism, no per-agent identity-flow detail, no per-call latency, no fail-closed-vs-open posture for this deployment disclosed. Pattern instance: patterns/intent-domain-decomposer-agentic-router composed with the Gateway as control plane.
- sources/2026-05-20-databricks-governing-ai-agents-at-scale-with-unity-catalog — generalised Unity AI Gateway from coding-agent scope to org-wide agent scope; canonicalised three-layer agent control (permissions / Service Policies / Guardrails); first disclosure of Service Policies + Guardrails + Inference Tables + Budgets + Lakewatch as named architectural surfaces. Repositioned in four-pillar framing.
- sources/2026-04-17-databricks-governing-coding-agent-sprawl-with-unity-ai-gateway — launch post (originally coding-agent-only scope).
Related¶
- systems/unity-catalog — audit + logging substrate.
- systems/mlflow — tracing substrate.
- systems/model-context-protocol — governed tool surface.
- systems/databricks-foundation-model-api — inference capacity the gateway routes to.
- concepts/coding-agent-sprawl — problem class it addresses.
- concepts/centralized-ai-governance — three-pillar framing.
- patterns/ai-gateway-provider-abstraction — parent pattern.
- patterns/central-proxy-choke-point — architectural posture.
- patterns/telemetry-to-lakehouse — observability shape.
- patterns/unified-billing-across-providers — cost posture.
- companies/databricks.