SYSTEM Cited by 3 sources

Unity Catalog (Databricks)

Unity Catalog (UC) is Databricks' unified governance solution for data and AI assets. Three distinct faces show up across ingested sources:

  1. As Databricks' internal governance/catalog service: originally a stateless service, later rebuilt as a Dicer-backed sharded in-memory cache (the Dicer case study).
  2. As the hub in a customer-facing data mesh: the global catalog that Mercedes-Benz federates AWS Iceberg tables into and shares via Delta Sharing to Azure consumers.
  3. As the audit and telemetry substrate for Unity AI Gateway: coding-agent and MCP audit logs, plus OpenTelemetry-sourced metrics and traces from all governed AI traffic, land in UC-managed Delta tables. This makes AI telemetry a first-class Lakehouse dataset that can be joined with business data. See sources/2026-04-17-databricks-governing-coding-agent-sprawl-with-unity-ai-gateway, patterns/telemetry-to-lakehouse, and systems/unity-ai-gateway.

Internal architecture (Dicer case study)

From the Dicer post: UC is the headline case study for Dicer's auto-sharder. It was originally a stateless service, and extremely high read volume drove prohibitive latency because every request hit the backend database.

Why remote caching was rejected

  • The cache must be updated incrementally and stay snapshot-consistent with storage.
  • Customer catalogs can be gigabytes in size, so holding partial or replicated snapshots in a remote cache would introduce substantial overhead.
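
The two requirements above can be made concrete with a small sketch. It is illustrative only and is not taken from the Dicer post: the `Change` record, the `CatalogCache` class, and the per-commit version counter are all hypothetical names, assumed here to show what "incrementally updated and snapshot-consistent" could mean for an in-process cache.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Change:
    """One committed change from the backing store (hypothetical shape)."""
    version: int            # storage commit version this change produces
    key: str                # e.g. a table's three-level name
    value: Optional[dict]   # new metadata, or None for a delete


@dataclass
class CatalogCache:
    """In-memory copy of one catalog that always equals storage at `version`."""
    version: int = 0
    entries: Dict[str, dict] = field(default_factory=dict)

    def apply(self, change: Change) -> None:
        # Apply changes strictly in commit order, so the cache never shows
        # a state that storage was never in (snapshot consistency).
        if change.version != self.version + 1:
            raise ValueError(
                f"gap: cache at v{self.version}, got v{change.version}"
            )
        if change.value is None:
            self.entries.pop(change.key, None)
        else:
            self.entries[change.key] = change.value
        self.version = change.version

    def get(self, key: str) -> Optional[dict]:
        return self.entries.get(key)


cache = CatalogCache()
cache.apply(Change(1, "main.sales.orders", {"owner": "alice"}))
cache.apply(Change(2, "main.sales.orders", {"owner": "bob"}))
cache.apply(Change(3, "main.sales.tmp", {"owner": "bob"}))
cache.apply(Change(4, "main.sales.tmp", None))  # delete
```

Each update touches one entry rather than reloading the whole catalog, which is the property that matters when a catalog is gigabytes in size.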

Dicer integration outcome

Unity Catalog is thus a concrete example of concepts/dynamic-sharding beating both the stateless service with a remote cache and static sharding as the architecture for a catalog-service read path.
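
To illustrate the difference between static and dynamic sharding, here is a minimal sketch under stated assumptions. It does not use Dicer's actual API, which this page does not describe. The `Assignment` class, `key_hash`, and `isolate` are hypothetical names. The idea shown is that a static scheme fixes key ownership by formula (for example `hash(key) % n`), whereas a dynamic scheme keeps an explicit range-to-pod table that a controller can edit at runtime, for instance to move one hot key to its own pod.

```python
import bisect
import hashlib
from typing import List, Tuple

HASH_SPACE = 2**32


def key_hash(key: str) -> int:
    """Stable 32-bit hash of a key (Python's built-in hash() is salted)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")


class Assignment:
    """Maps contiguous hash ranges to pods; ranges can change at runtime."""

    def __init__(self, pods: List[str]) -> None:
        # Start with one equal-width range per pod. Each entry is
        # (inclusive range start, owning pod), kept sorted by start.
        width = HASH_SPACE // len(pods)
        self.ranges: List[Tuple[int, str]] = [
            (i * width, pod) for i, pod in enumerate(pods)
        ]

    def _index(self, h: int) -> int:
        starts = [start for start, _ in self.ranges]
        return bisect.bisect_right(starts, h) - 1

    def owner(self, key: str) -> str:
        return self.ranges[self._index(key_hash(key))][1]

    def isolate(self, key: str, new_pod: str) -> None:
        """Carve a one-hash-wide range around `key` and give it to `new_pod`."""
        h = key_hash(key)
        idx = self._index(h)
        start, old_pod = self.ranges[idx]
        next_start = (
            self.ranges[idx + 1][0] if idx + 1 < len(self.ranges) else HASH_SPACE
        )
        pieces: List[Tuple[int, str]] = []
        if start < h:
            pieces.append((start, old_pod))   # part below the hot key
        pieces.append((h, new_pod))           # the hot key itself
        if h + 1 < next_start:
            pieces.append((h + 1, old_pod))   # part above the hot key
        self.ranges[idx:idx + 1] = pieces


assignment = Assignment(["pod-0", "pod-1", "pod-2"])
assignment.isolate("catalog-hot", "pod-3")
print(assignment.owner("catalog-hot"))  # pod-3
```

Only the isolated key changes owner. Every other key keeps its pod, so existing in-memory cache state on those pods stays valid.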

Customer-facing role (Mercedes-Benz data mesh)

From the Mercedes-Benz case study, UC plays the global catalog role in a cross-cloud concepts/data-mesh:

  • Centralises metadata and access-control across metastores, regions, and hyperscalers — one governance plane for AWS + Azure. This is the architectural realisation of concepts/hub-and-spoke-governance.
  • Federates Iceberg tables from AWS Glue — registering them in UC so they can participate in systems/delta-sharing without the producer rewriting into Delta first. Format translation happens at the federation boundary.
  • Speaks systems/delta-sharing — the open exchange protocol between UC metastores (cross-region, cross-cloud) and with external partners (suppliers).
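
From the consumer side, reading a shared table could look like the sketch below. It assumes the open-source `delta-sharing` Python connector and its `<profile-file>#<share>.<schema>.<table>` addressing. The profile path and the share, schema, and table names are hypothetical placeholders, not values from the Mercedes-Benz case study.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile>#<share>.<schema>.<table>' address for a shared table."""
    return f"{profile_path}#{share}.{schema}.{table}"


def read_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Load a shared table into a pandas DataFrame via the delta-sharing connector."""
    # Imported lazily so the URL helper works without the package installed
    # (install with: pip install delta-sharing).
    import delta_sharing

    return delta_sharing.load_as_pandas(
        table_url(profile_path, share, schema, table)
    )


# Hypothetical names, for illustration only.
url = table_url("/path/to/config.share", "supplier_share", "quality", "parts")
print(url)  # /path/to/config.share#supplier_share.quality.parts
```

The consumer only needs the profile file issued by the provider. It does not need direct access to the producer's cloud storage account or to know that the underlying table was federated from Iceberg.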

The self-service orchestration layer over UC on this mesh is systems/ddx-orchestrator, which automates permission management and Sync-Job lifecycle so domain teams don't operate shares by hand.

(Source: sources/2026-04-20-databricks-mercedes-benz-cross-cloud-data-mesh)

Last updated · 200 distilled / 1,178 read