SYSTEM Cited by 3 sources
Unity Catalog (Databricks)¶
Unity Catalog (UC) is Databricks' unified governance solution for data and AI assets. Three distinct faces show up across ingested sources:
- As Databricks' internal governance/catalog service: originally stateless, now a sharded in-memory cache whose shard assignment is managed by Dicer (see the Dicer case study).
- As the hub in a customer-facing data mesh — the global catalog Mercedes-Benz federates AWS Iceberg tables into and shares via Delta Sharing to Azure consumers.
- As the audit / telemetry substrate for Unity AI Gateway — coding-agent + MCP audit logs plus OpenTelemetry-sourced metrics/traces from all governed AI traffic land in UC-managed Delta tables, making AI telemetry a first-class Lakehouse dataset joinable with business data. See sources/2026-04-17-databricks-governing-coding-agent-sprawl-with-unity-ai-gateway + patterns/telemetry-to-lakehouse + systems/unity-ai-gateway.
Internal architecture (Dicer case study)¶
In the Dicer post, UC is the headline case study for Dicer, Databricks' auto-sharder. UC was originally a stateless service, and because every request hit the backend database, its extremely high read volume drove latency to prohibitive levels.
Why remote caching was rejected¶
- Cache must be incrementally updated and snapshot-consistent with storage.
- Customer catalogs can be gigabytes — partial or replicated snapshots in a remote cache would introduce substantial overhead.
Dicer integration outcome¶
- Sharded in-memory stateful cache across pods, assignment maintained by systems/dicer.
- Remote network calls replaced by local method calls within the owning pod.
- Cache hit rate: 90–95 % (Source: sources/2026-01-13-databricks-open-sourcing-dicer-auto-sharder).
- Far fewer DB round-trips, and correspondingly lower DB load.
Unity Catalog is thus a concrete example of concepts/dynamic-sharding beating both alternatives for a catalog-service read path: a stateless service fronted by a remote cache, and static-sharding.
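The read path described above can be illustrated with a small sketch. It is not Dicer's API or UC's implementation, neither of which is shown in the source. The names `Pod`, `Router`, and `simulate` are hypothetical, and a fixed round-robin slice assignment stands in for the assignment that Dicer would maintain and rebalance dynamically. The point is only to show how routing each key to an owning pod turns repeated reads into local dictionary lookups.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Pod:
    """A service pod that caches the keys it owns (hypothetical model)."""
    name: str
    load_from_db: Callable[[str], str]
    cache: Dict[str, str] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def get(self, key: str) -> str:
        # Hit: served by a local lookup, no network call and no DB round-trip.
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        # Miss: fall through to the backend database once, then keep the value.
        self.misses += 1
        value = self.load_from_db(key)
        self.cache[key] = value
        return value


class Router:
    """Maps key slices to pods. Assumption: a real auto-sharder would own and
    rebalance this table; here it is a fixed round-robin assignment."""

    def __init__(self, pods: List[Pod], num_slices: int = 16) -> None:
        self.num_slices = num_slices
        self.assignment = {s: pods[s % len(pods)] for s in range(num_slices)}

    def slice_of(self, key: str) -> int:
        # Stable hash so a key always lands in the same slice.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_slices

    def get(self, key: str) -> str:
        return self.assignment[self.slice_of(key)].get(key)


def simulate(num_keys: int = 10, reads_per_key: int = 10) -> Tuple[int, float]:
    """Read every key `reads_per_key` times; return (db_reads, hit_rate)."""
    db_reads = 0

    def load_from_db(key: str) -> str:
        nonlocal db_reads
        db_reads += 1
        return f"metadata-for-{key}"

    pods = [Pod(f"pod-{i}", load_from_db) for i in range(3)]
    router = Router(pods)
    for _ in range(reads_per_key):
        for k in range(num_keys):
            router.get(f"catalog.schema.table_{k}")

    hits = sum(p.hits for p in pods)
    misses = sum(p.misses for p in pods)
    return db_reads, hits / (hits + misses)


if __name__ == "__main__":
    reads, rate = simulate()
    print(f"DB reads: {reads}, hit rate: {rate:.0%}")
```

In this toy workload each key misses exactly once, so the hit rate is determined entirely by how often keys are re-read. It says nothing about the production figure quoted above. The sketch also omits the incremental updates and snapshot consistency that the real cache must provide.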
Customer-facing role (Mercedes-Benz data mesh)¶
From the Mercedes-Benz case study, UC plays the global catalog role in a cross-cloud concepts/data-mesh:
- Centralises metadata and access-control across metastores, regions, and hyperscalers — one governance plane for AWS + Azure. This is the architectural realisation of concepts/hub-and-spoke-governance.
- Federates Iceberg tables from AWS Glue — registering them in UC so they can participate in systems/delta-sharing without the producer rewriting into Delta first. Format translation happens at the federation boundary.
- Speaks systems/delta-sharing — the open exchange protocol between UC metastores (cross-region, cross-cloud) and with external partners (suppliers).
The self-service orchestration layer over UC on this mesh is systems/ddx-orchestrator, which automates permission management and Sync-Job lifecycle so domain teams don't operate shares by hand.
(Source: sources/2026-04-20-databricks-mercedes-benz-cross-cloud-data-mesh)
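For orientation, here is a sketch of the consumer side of a Delta Sharing exchange using the open-source `delta-sharing` Python connector. The source does not show how Mercedes-Benz consumers actually read shared tables, so this is an assumption about one possible access path. The share, schema, and table names and the profile path are hypothetical. The connector addresses a table as `<profile-file>#<share>.<schema>.<table>`, and the helper below builds that string.

```python
def sharing_table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the table address used by the delta-sharing connector.

    Simplification for this sketch: reject names containing '.' or '#',
    because they would make the address ambiguous.
    """
    for part in (share, schema, table):
        if not part or "." in part or "#" in part:
            raise ValueError(f"unsupported name in this sketch: {part!r}")
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Load a shared table into pandas (requires `pip install delta-sharing`)."""
    # Imported lazily so the URL helper works without the connector installed.
    import delta_sharing

    return delta_sharing.load_as_pandas(
        sharing_table_url(profile_path, share, schema, table)
    )


if __name__ == "__main__":
    # Hypothetical names; the profile file is issued by the data provider.
    print(sharing_table_url("config.share", "vehicle_share", "telemetry", "trips"))
```

The provider side (creating shares, adding the federated Iceberg tables, and granting access to recipients) is handled in UC and, on this mesh, automated by systems/ddx-orchestrator, so the consumer only needs the profile file and the table address.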
Seen in¶
- sources/2026-01-13-databricks-open-sourcing-dicer-auto-sharder — UC's Dicer-backed in-memory-cache rollout; 90–95 % hit rate.
- sources/2026-04-20-databricks-mercedes-benz-cross-cloud-data-mesh — UC as global catalog + federation layer in Mercedes-Benz's AWS↔Azure data mesh; Iceberg-on-Glue federation, Delta-Sharing exchange.