UC OTel Trace Tables (Databricks)¶
UC OTel Trace Tables are the Unity Catalog-managed Delta tables (and MLflow-managed Databricks SQL views) that hold OpenTelemetry spans, logs, metrics, and MLflow trace metadata for agents instrumented through MLflow's OTel tracing — created by MLflow during experiment setup, populated by Zerobus Ingest, and exposed under the same UC governance as enterprise data.
Definition (from the source)¶
"This setup creates Unity Catalog tables for OpenTelemetry spans, logs, and metrics. The underlying data is stored in OpenTelemetry-compliant table formats, and the MLflow service automatically creates Databricks SQL views alongside them that transform the OpenTelemetry data into an MLflow-friendly format for easier querying and analysis." — Source: sources/2026-05-22-databricks-observability-any-agent-anywhere-otel-unity-catalog
Schema surface (six derived views)¶
| Table / View | Granularity | Purpose |
|---|---|---|
| `<prefix>_otel_spans` | One row per span | "Detailed span-level execution data for each request" |
| `<prefix>_otel_logs` | One row per log event | "Structured log/event data captured during execution" |
| `<prefix>_otel_metrics` | One row per metric sample | "Numerical telemetry captured during execution" |
| `<prefix>_otel_annotations` | One row per annotation | "MLflow-specific trace data that is not a standard OTel signal, including metadata, tags, assessments/feedback, expectations, and run links" |
| `<prefix>_trace_unified` | One row per trace | "a consolidated view that assembles trace data into a single record per trace, including raw span data and trace metadata" |
| `<prefix>_trace_metadata` | One row per trace ID | "MLflow tags, metadata, and assessments grouped by trace ID; more performant than the unified view when you only need MLflow trace metadata" |
The first three are standard OTel signals. _annotations is the MLflow extension carrying assessment / feedback / expectation / run-link data that has no native OTel correlate. _trace_unified and _trace_metadata are MLflow-derived consolidations for query ergonomics.
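The naming convention in the table can be sketched as a small helper. This is only an illustration: the suffixes come from the source, while the function name, the grouping constants, and the example catalog, schema, and prefix values are hypothetical, and the three-level `catalog.schema.table` form assumes the objects sit in an ordinary Unity Catalog schema.

```python
# Suffixes as listed in the source; the prefix is chosen at experiment setup.
OTEL_SIGNAL_SUFFIXES = ("otel_spans", "otel_logs", "otel_metrics")
MLFLOW_SUFFIXES = ("otel_annotations", "trace_unified", "trace_metadata")


def trace_table_names(catalog, schema, prefix):
    """Map each suffix to a fully qualified catalog.schema.<prefix>_<suffix> name.

    `catalog`, `schema`, and `prefix` are caller-supplied; nothing here
    checks that the objects actually exist.
    """
    return {
        suffix: f"{catalog}.{schema}.{prefix}_{suffix}"
        for suffix in OTEL_SIGNAL_SUFFIXES + MLFLOW_SUFFIXES
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for suffix, name in trace_table_names("main", "agents", "support_bot").items():
        print(f"{suffix:>16} -> {name}")
```

Keeping the two suffix groups separate mirrors the split described above: three standard OTel signals and three MLflow-specific surfaces.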
Why six tables instead of one¶
The post does not state this explicitly, but the structural reasoning is likely as follows:
- Spans, logs, and metrics are OTel signals with different shapes and cardinalities; colocating them in one table would force null-heavy schemas and slow scans.
- `_annotations` is MLflow-specific (LLM-judge feedback, human labels, expectation overrides); separating it from raw OTel keeps the OTel surface protocol-clean.
- `_trace_unified` is the "give me the whole trace as one record" convenience surface for ad-hoc SQL analysis.
- `_trace_metadata` is a deliberate denormalisation: trace-level metadata only (no spans inlined), so dashboards that join on trace_id but don't need span detail can skip the wide-record cost. "more performant than the unified view when you only need MLflow trace metadata".
Operational properties¶
- Auto liquid-clustered: "With the latest product update, the tables are automatically liquid clustered to keep the data optimally organized." The clustering key is not disclosed but presumably trace_id / time-based.
- Materialized view recommendation for scale: "For larger trace volumes, however, you should create a materialized view on top of the derived views and incrementally refresh it to maintain query performance." No volume threshold is stated.
- Storage limit: none.
- MLflow per-experiment trace cap: "Previous limits on traces per experiment are no longer applicable" — UC-resident storage replaces the older capped-trace-count model.
- Change Data Feed (CDF) is supported (per the post) so trace tables can feed downstream ETL: "a pipeline could monitor trace patterns and trigger alerts when latency exceeds defined thresholds, tool failures spike, or token usage deviates from expected baselines".
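The alerting idea in the CDF bullet can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the pipeline from the post: it assumes some upstream job has already reduced newly arrived trace rows to per-trace summaries, and the `TraceSummary` fields, the function name, and all thresholds are hypothetical (the source does not publish column names).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TraceSummary:
    """Hypothetical per-trace summary; these fields are NOT the real schema."""
    trace_id: str
    latency_ms: float
    tool_failures: int
    total_tokens: int


def evaluate_alerts(batch, max_latency_ms, max_failure_rate,
                    baseline_tokens, token_tolerance):
    """Return alert messages for one batch of newly arrived trace summaries.

    Assumes baseline_tokens > 0.
    """
    alerts = []
    if not batch:
        return alerts

    # Latency: count traces whose latency exceeds the threshold.
    slow = [t.trace_id for t in batch if t.latency_ms > max_latency_ms]
    if slow:
        alerts.append(f"latency: {len(slow)} trace(s) above {max_latency_ms} ms")

    # Tool failures: share of traces with at least one failed tool call.
    failure_rate = sum(1 for t in batch if t.tool_failures > 0) / len(batch)
    if failure_rate > max_failure_rate:
        alerts.append(f"tool failures: rate {failure_rate:.2f} above {max_failure_rate:.2f}")

    # Token usage: relative deviation of the batch mean from the baseline.
    deviation = abs(mean(t.total_tokens for t in batch) - baseline_tokens) / baseline_tokens
    if deviation > token_tolerance:
        alerts.append(f"tokens: batch mean deviates {deviation:.0%} from baseline")

    return alerts
```

In a real pipeline the batch would come from an incremental read of the trace tables, and the thresholds would be tuned per agent; both are outside what the source specifies.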
Governance properties (inherited from UC)¶
The structural payoff: trace tables get UC's governance substrate automatically, with no AI-specific configuration.
"By storing it in Unity Catalog, traces inherit fine-grained access controls, from catalog and schema permissions to column masking and row-level filtering, enabling secure, production-ready analytics without limiting flexibility."
This matters because trace data contains prompts and responses:
"Prompts and responses, however, often contain sensitive information, so treating trace data as governed data is critical."
Without UC, an OTel store would need a bolt-on PII handling layer. UC provides:
- Column masking — redact prompt-text or response-text columns from non-privileged readers.
- Row-level filtering — restrict access by tenant / org-unit / classification tag.
- Catalog/schema RBAC — coarse-grain access control inherited from the rest of UC.
- Audit logs — every read of trace data is itself audited.
- Tag-driven policy — if `data-classification.is_pii=true` propagates to trace columns, ABAC rules already in place over business data apply transparently.
The 2026-05-22 post explicitly notes "This feature does not apply any special handling to PII" — the design choice is to outsource PII handling to the existing UC pipeline rather than re-implement at the trace-table layer.
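To make the column-masking option concrete, the following sketch generates the two Databricks SQL statements usually involved: a masking function and an `ALTER TABLE ... SET MASK` that attaches it. Several things here are assumptions rather than facts from the source: the column is assumed to be a `STRING`, the table, column, function, and group names are hypothetical, and the source does not say whether the MLflow-managed trace tables accept user-defined masks. Masks attach to tables rather than plain views, so the derived views may need different handling.

```python
def masking_statements(table, column, mask_fn, privileged_group):
    """Build Databricks SQL that masks `column` for readers outside `privileged_group`.

    All arguments are trusted identifiers supplied by the caller; this helper
    only formats strings and does not execute anything.
    """
    create_fn = (
        f"CREATE OR REPLACE FUNCTION {mask_fn}(val STRING) "
        f"RETURN CASE WHEN is_account_group_member('{privileged_group}') "
        f"THEN val ELSE '[REDACTED]' END"
    )
    apply_mask = f"ALTER TABLE {table} ALTER COLUMN {column} SET MASK {mask_fn}"
    return [create_fn, apply_mask]


if __name__ == "__main__":
    # Hypothetical names: inspect the created tables for the real column names.
    for stmt in masking_statements(
        table="main.agents.support_bot_otel_spans",
        column="prompt_text",
        mask_fn="main.agents.mask_prompt",
        privileged_group="trace-admins",
    ):
        print(stmt + ";")
```

Generating the statements instead of running them keeps the sketch independent of any particular SQL client; they could be pasted into a SQL editor after review.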
Consumers¶
- MLflow Experiment UI — search, filter, drill into spans, annotate / label / score with judges. Native dashboards: "trace volume, errors, latency, token usage, and cost".
- Ad-hoc SQL — "the trace tables are still just Delta tables in Unity Catalog. You can build a custom AI/BI Dashboard against them and write standard SQL".
- Genie spaces — "By exposing trace tables through Genie, teams can enable natural-language analysis over their telemetry data".
- ETL pipelines — CDF-driven incremental processing for alerting / remediation / aggregation.
- Evaluation dataset bootstrap — MLflow uses a SQL warehouse to "search and materialize dataset records" from these tables for prod-traces-as-eval-substrate workflows.
- Continuous evaluation — "MLflow can automatically evaluate live traces using the same judges, helping us quickly detect regressions, drift, and emerging failure patterns".
Sibling substrate: Inference Tables¶
Both Inference Tables (2026-05-20 disclosure) and UC OTel Trace Tables land observability data as Delta tables in Unity Catalog. They differ in granularity:
- Inference Tables — one row per model call; captures verbatim prompt + response + tokens + latency at the Unity AI Gateway proxy choke point.
- UC OTel Trace Tables — one row per span within a trace; captures per-step execution path (tool calls, LLM calls, retrieval, etc.) at the agent OTel SDK (in-process instrumentation).
The two compose: Inference Tables answer "what was sent to and received from the model" for full-payload audit; OTel Trace Tables answer "what path did the agent take, and which step was the bottleneck" for execution-shape analysis. Both are first-class UC datasets queryable with the same SQL.
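The "which step was the bottleneck" question can be illustrated on span rows that have already been fetched for one trace. This is a sketch, not the post's method: the field names (`span_id`, `parent_span_id`, `name`, `start_time_unix_nano`, `end_time_unix_nano`) follow the OTLP span definition, and whether the UC table columns use exactly these names is an assumption. Only leaf spans are compared, because a parent span's duration always includes its children.

```python
def duration_ms(span):
    """Span duration in milliseconds from OTLP-style nanosecond timestamps."""
    return (span["end_time_unix_nano"] - span["start_time_unix_nano"]) / 1e6


def slowest_leaf_span(spans):
    """Return (name, duration_ms) of the longest leaf span in one trace.

    A leaf span is one that no other span names as its parent.
    """
    parent_ids = {s["parent_span_id"] for s in spans if s.get("parent_span_id")}
    leaves = [s for s in spans if s["span_id"] not in parent_ids]
    worst = max(leaves, key=duration_ms)
    return worst["name"], duration_ms(worst)


if __name__ == "__main__":
    # Hypothetical trace: a root span with a retrieval step and an LLM call.
    trace = [
        {"span_id": "r", "parent_span_id": None, "name": "agent_run",
         "start_time_unix_nano": 0, "end_time_unix_nano": 1_000_000_000},
        {"span_id": "a", "parent_span_id": "r", "name": "retrieval",
         "start_time_unix_nano": 0, "end_time_unix_nano": 200_000_000},
        {"span_id": "b", "parent_span_id": "r", "name": "llm_call",
         "start_time_unix_nano": 200_000_000, "end_time_unix_nano": 900_000_000},
    ]
    print(slowest_leaf_span(trace))
```

An Inference Table row for the same request would instead hold the verbatim payload of the model call, which is why the two substrates complement each other.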
Caveats¶
- The six-view surface is MLflow-specific. Customers who export OTel directly via OTLP/gRPC without going through MLflow likely write to a different / simpler shape; the post focuses on the MLflow-managed flow.
- No public schema for individual columns — the post lists tables and their semantic purpose but does not specify column names or types. Practitioners must inspect the created tables.
- Liquid-clustering key not disclosed. Default behaviour matters when filtering on non-clustered columns.
- Materialized view recommendation has no volume threshold — "larger trace volumes" is qualitative.
- CDF integration is described, not benchmarked. Latency from trace ingest → CDF-emitted alert is not stated.
- Cost dashboards use list prices by default. "Native cost metrics rely on standard list prices, which can be off for teams that have negotiated rates" — custom-SQL contract-pricing logic must be added by the customer.
- Per-tenant isolation in shared experiments not addressed. If multiple teams share an experiment, governance is at the UC catalog/schema level rather than within the trace tables.
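The list-price caveat above implies some customer-side arithmetic. A minimal sketch, assuming token counts have already been extracted from the trace tables and that rates are quoted per million tokens; the function name and every rate below are hypothetical placeholders, not figures from the source.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)


def trace_cost(input_tokens, output_tokens,
               input_rate_per_million, output_rate_per_million):
    """Cost of one trace given token counts and per-million-token rates.

    Rates are Decimal to avoid binary floating-point drift in currency math.
    """
    return (
        Decimal(input_tokens) * input_rate_per_million
        + Decimal(output_tokens) * output_rate_per_million
    ) / MILLION


if __name__ == "__main__":
    # Hypothetical list and negotiated rates, in currency units per million tokens.
    list_cost = trace_cost(1_000_000, 500_000, Decimal("3.00"), Decimal("12.00"))
    contract_cost = trace_cost(1_000_000, 500_000, Decimal("2.00"), Decimal("8.00"))
    print("list:", list_cost, "contract:", contract_cost,
          "difference:", list_cost - contract_cost)
```

The same formula could be expressed in SQL over the trace tables once the token-count columns are known.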
Seen in¶
- sources/2026-05-22-databricks-observability-any-agent-anywhere-otel-unity-catalog — canonical disclosure; the six-view schema, auto-liquid-clustering, MV recommendation, and "Previous limits on traces per experiment are no longer applicable" removal of the MLflow trace cap are all in this source.
Related¶
- systems/zerobus-ingest — producer.
- systems/mlflow-otel-tracing — instrumentation companion that creates these tables on experiment setup.
- systems/mlflow — broader ML lifecycle platform.
- systems/opentelemetry — wire protocol.
- systems/unity-catalog — governance substrate.
- systems/delta-lake — physical storage format.
- systems/inference-tables — sibling full-payload audit substrate; different granularity, same governance surface.
- concepts/lakehouse-native-observability — the architectural posture this system instantiates.
- concepts/single-sink-telemetry-architecture — the upstream shape.
- concepts/instrumentation-storage-decoupling — what makes the schema OTel-portable.
- concepts/audit-trail — what these tables provide for AI traffic.
- patterns/managed-otel-ingestion-direct-to-lakehouse — the broader pattern.
- patterns/telemetry-to-lakehouse — generalisation.
- patterns/component-level-latency-from-otel-spans — query pattern over `_otel_spans`.
- companies/databricks