Telemetry to Lakehouse¶
Telemetry to Lakehouse is the pattern of landing operational / tool / agent telemetry directly into governed open-table-format tables (typically Delta Lake or Iceberg) instead of an APM / observability-vendor sidecar — so the telemetry becomes a first-class Lakehouse dataset joinable with business data.
Mechanics¶
- Clients emit OpenTelemetry metrics + traces (standard protocol, avoids vendor lock).
- A managed ingestion pipeline writes OTel data into Lakehouse-resident tables (Unity-Catalog-managed Delta tables in the Databricks instance).
- Tables are governed under the same catalog / IAM / audit posture as the rest of the enterprise's data — one policy surface for telemetry + business data.
- Analysts query telemetry with the same SQL + dashboard tooling as HR, finance, CRM data.
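The ingestion step above can be sketched in miniature. This is a hypothetical illustration, not Databricks' actual pipeline or schema: it flattens an OTel-style metric data point (represented as a dict) into the flat row shape a lakehouse table wants, promoting the attributes analysts join on into stable columns. The attribute keys follow OTel GenAI semantic-convention naming, but the output column names are invented.

```python
from datetime import datetime, timezone

def flatten_metric(point: dict) -> dict:
    """Flatten a nested OTel-style metric point into one table row.

    Column names are illustrative, not an actual Unity Catalog schema.
    """
    attrs = point.get("attributes", {})
    return {
        "metric_name": point["name"],
        "value": point["value"],
        # OTel timestamps are nanoseconds since the epoch.
        "ts": datetime.fromtimestamp(point["time_unix_nano"] / 1e9, tz=timezone.utc),
        # Promote join-key attributes to first-class columns.
        "user_id": attrs.get("user.id"),
        "tool": attrs.get("gen_ai.tool.name"),
        "model": attrs.get("gen_ai.request.model"),
    }
```

A real pipeline would do this transformation in bulk (e.g. in Spark) before writing Delta files, but the row shape is the point: telemetry lands as columns, not as an opaque vendor blob.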
Canonical instance: Unity AI Gateway (2026-04-17)¶
Databricks' coding-agent post states the design explicitly: "With our OpenTelemetry ingestion, coding tool metrics and traces are automatically centralized to Unity Catalog-managed Delta tables." (Source: sources/2026-04-17-databricks-governing-coding-agent-sprawl-with-unity-ai-gateway.)
Three use cases named in the post:
- Track adoption per org: "Join AI Gateway metrics with Workday to map GenAI adoption by department, region, or seniority, helping identify where to target enablement."
- Quantify developer velocity: "A 20% increase in token usage per developer drove a 15% reduction in pull request cycle time, directly linking AI tool usage to increased developer velocity."
- Proactive capacity planning: "Monitor users hitting rate limits to data-justify securing additional capacity or dedicated throughput before productivity is throttled."
Each of these requires joining gateway telemetry with a non-telemetry dataset (HR org chart, CI/CD PR metrics, capacity-cost tables). That join is the point of the pattern.
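The shape of the first use case — telemetry joined with an HR table — can be sketched as follows. This is a toy stand-in, not the post's actual query: in practice this is a SQL join over two lakehouse tables, and the field names (`user_id`, `department`, `tokens`) are assumptions for illustration.

```python
from collections import defaultdict

def adoption_by_department(usage_rows: list[dict], hr_rows: list[dict]) -> dict:
    """Join per-user gateway token usage with an HR org table and
    aggregate AI adoption by department (illustrative schema)."""
    dept_of = {r["user_id"]: r["department"] for r in hr_rows}
    totals: dict[str, int] = defaultdict(int)
    for row in usage_rows:
        # Users missing from the HR table land in an "unknown" bucket.
        totals[dept_of.get(row["user_id"], "unknown")] += row["tokens"]
    return dict(totals)
```

The join key (a user identity present in both the telemetry and the HR system) is exactly what an APM-resident copy of the telemetry makes awkward to use.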
Why Lakehouse instead of APM¶
- APM vendors silo telemetry. Datadog / New Relic / Grafana Cloud store telemetry in their own backend; joining with Workday or Jira means ETL-ing data out of the APM, which breaks both the freshness and the governance story.
- Business-value questions require business-data joins. "Are AI tools changing PR cycle time?" is a question about PR cycle time, not about LLM latency. APM can't answer it.
- Governance is unified. Same RBAC, same retention policies, same audit log covers telemetry + revenue data + HR data. No "observability data is special" escape hatch.
Costs / caveats¶
- Latency vs APM. Lakehouse query latency is seconds-to-minutes, not APM's sub-second. This pattern is for analytical telemetry questions, not real-time alerting. Real-time alerting typically still lives in an APM / Prometheus / Carnaval sidecar.
- Schema discipline required. Telemetry evolves fast; making it useful to analysts means stable columns / enums.
- Lakehouse-vendor-lock-in tradeoff. Avoids APM-vendor lock-in, but trades it for lakehouse-vendor lock-in. Open table formats (Delta / Iceberg) mitigate this.
- OTel as the wire-protocol boundary keeps the pattern Lakehouse-vendor-portable: Databricks today, could be Snowflake or Trino + Iceberg tomorrow.
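The schema-discipline caveat above can be made concrete with a small sketch. This is a hypothetical normalization step, with invented column and enum names: raw telemetry rows are coerced onto a stable column set with a closed enum, so analyst queries and dashboards don't break as instrumentation evolves.

```python
# Stable contract exposed to analysts; instrumentation may add or rename
# fields upstream, but rows landing in the table always have this shape.
STABLE_COLUMNS = ("user_id", "tool", "tokens")
KNOWN_TOOLS = {"copilot", "cursor", "cli_agent"}  # closed enum (illustrative)

def normalize(raw: dict) -> dict:
    """Project a raw telemetry record onto the stable column set."""
    row = {col: raw.get(col) for col in STABLE_COLUMNS}  # drop unknown fields
    if row["tool"] not in KNOWN_TOOLS:
        row["tool"] = "other"  # unknown tools collapse into one bucket
    return row
```

Doing this at ingestion time is what turns fast-evolving telemetry into something analysts can build durable dashboards on.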
Relation to other patterns¶
- Pairs with patterns/central-proxy-choke-point — the gateway is the natural place to emit the OTel, because it sees all traffic for all tools.
- Specialisation of concepts/observability — same underlying discipline, different storage substrate.
Seen in¶
- sources/2026-04-17-databricks-governing-coding-agent-sprawl-with-unity-ai-gateway — Unity AI Gateway → Unity-Catalog-managed Delta tables via OpenTelemetry; first-class "telemetry-joined-with-Workday" framing.