SYSTEM Cited by 1 source

Unity AI Gateway (Databricks)

Unity AI Gateway is Databricks' productised instance of the AI-gateway provider abstraction pattern, specialised to coding agents + MCP integrations rather than just application LLM calls. Its job is to be the single governance + cost + telemetry plane for all coding-tool traffic in a Databricks customer's fleet.

Three-pillar architecture (from the 2026-04-17 launch post)

  1. Centralised security and audit.
     • Every agent data-access flow is logged in Unity Catalog (the same governance substrate as Lakehouse data + ML assets).
     • All tracing goes to MLflow (specifically MLflow 3 GenAI tracing, named for the Claude Code integration).
     • MCP servers are "managed in Databricks": the gateway is the policy point for MCP traffic, not just LLM traffic.
     • Single-identity plane: developers authenticate once with Databricks credentials for all tools (GitHub, Atlassian, etc.), "no separate logins per service".
  2. Single bill and cost limits.
     • Foundation Model API provides first-party inference for OpenAI, Anthropic, Gemini, and open models like Qwen.
     • Admins can also "bring external capacity in", extending governance "to all your tokens, regardless of where they flow" (patterns/unified-billing-across-providers).
     • Gateway-enforced budgets are per-developer, not per-tool: admins give each developer one budget, and the developer spends it on whichever tool they choose (Cursor / Codex / Gemini CLI / Claude Code / …).
  3. Full observability in the Lakehouse.
     • Coding-tool metrics + traces land in Unity-Catalog-managed Delta tables via OpenTelemetry ingestion.
     • This telemetry is joinable with other Lakehouse datasets, e.g. Workday for adoption by org / region / seniority, or PR-cycle data for velocity quantification (patterns/telemetry-to-lakehouse).
     • The gateway surfaces rate-limit hits as a proactive capacity-planning signal.
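
The per-developer (not per-tool) budget described under the cost pillar can be illustrated with a minimal sketch. The post does not disclose how the gateway enforces budgets, so everything here is an assumption made for illustration: `DeveloperBudget`, `BudgetGate`, `set_budget`, and `charge` are hypothetical names, not Databricks APIs, and the dollar amounts are invented. The sketch shows only the shape of the idea: one limit keyed by developer, with per-tool spend recorded for reporting but never capped, and rejections counted as a possible capacity-planning signal.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DeveloperBudget:
    """Hypothetical ledger for one developer; not a Databricks API."""
    limit_usd: float
    spent_usd: float = 0.0
    # Spend is tracked per tool for reporting only; no per-tool cap exists.
    by_tool: dict = field(default_factory=lambda: defaultdict(float))


class BudgetGate:
    """Illustrative gate that enforces one budget per developer."""

    def __init__(self):
        self.budgets = {}
        # Rejected requests per developer, usable as a capacity signal.
        self.rejections = defaultdict(int)

    def set_budget(self, developer, limit_usd):
        self.budgets[developer] = DeveloperBudget(limit_usd)

    def charge(self, developer, tool, cost_usd):
        """Record the spend and return True if the developer's budget allows it."""
        budget = self.budgets[developer]
        if budget.spent_usd + cost_usd > budget.limit_usd:
            self.rejections[developer] += 1
            return False
        budget.spent_usd += cost_usd
        budget.by_tool[tool] += cost_usd
        return True


gate = BudgetGate()
gate.set_budget("alice", 10.0)
gate.charge("alice", "Cursor", 6.0)       # allowed
gate.charge("alice", "Claude Code", 3.0)  # allowed: same budget, different tool
gate.charge("alice", "Codex", 2.0)        # rejected: 9.0 + 2.0 exceeds 10.0
```

Keying the limit on the developer rather than the tool means that switching tools never resets or splits the allowance, which is the behaviour the launch post describes. A real gateway would presumably meter tokens and prices per provider, which this sketch does not attempt.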

Supported clients (at launch)

Relation to existing wiki entities

What the post does not disclose

  • Gateway internals: routing, fallback, rate-limiter algorithm, streaming handling, per-provider adapter shape.
  • MCP-governance mechanics: how the gateway inspects MCP traffic, auth flow from coding-tool → gateway → MCP → data source.
  • Telemetry schema landing in Delta tables.
  • Latency / throughput / cost-per-token / adoption numbers.

Tier-3 Databricks post — ingested because the problem framing (coding-agent sprawl) and three-pillar architecture are substantive, not because the internals are disclosed.

Seen in
