PATTERN Cited by 1 source

Managed OTel ingestion direct to lakehouse

Managed OTel ingestion direct to lakehouse is the pattern of using a managed, serverless OTLP receiver (gRPC + REST) as the only hop between OTel-instrumented clients and a governed columnar-storage destination (Delta Lake / Iceberg). It removes intermediate brokers such as Kafka and pushes operational complexity to the platform layer.

Mechanics

clients (OTel SDKs / collectors)
        │ OTLP/gRPC  (open-source collectors)
        │ REST       (framework SDKs like MLflow)
        ▼
   managed OTel receiver  ◄── serverless, vendor-operated
        │
        ▼
   lakehouse Delta tables (UC / Polaris-governed)
        │
        ▼
   downstream consumers (SQL, dashboards, ETL, eval)
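
To make the REST hop in the diagram concrete, here is a minimal sketch of a one-span export request. It assumes the receiver accepts standard OTLP/HTTP with JSON encoding on the conventional `/v1/traces` path, which the source does not confirm. The endpoint URL, the bearer-token header, and the function names are hypothetical, and the request is only built, never sent.

```python
import json
import secrets
import time
import urllib.request

# Hypothetical endpoint: the source does not publish the receiver's URL.
ENDPOINT = "https://otel-ingest.example.com/v1/traces"


def build_trace_payload(service_name: str, span_name: str, duration_ms: int) -> dict:
    """Build a one-span export body in the OTLP JSON encoding."""
    end_ns = time.time_ns()
    start_ns = end_ns - duration_ms * 1_000_000
    return {
        "resourceSpans": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service_name}}
                ]
            },
            "scopeSpans": [{
                "scope": {"name": "manual-sketch"},
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 16 random bytes as 32 hex chars
                    "spanId": secrets.token_hex(8),    # 8 random bytes as 16 hex chars
                    "name": span_name,
                    "kind": 1,  # SPAN_KIND_INTERNAL
                    # 64-bit integers are carried as decimal strings in JSON.
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                }],
            }],
        }]
    }


def build_request(payload: dict, token: str) -> urllib.request.Request:
    """Wrap the payload in an HTTP POST; the auth scheme is an assumption."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

In practice an OTel SDK or collector builds and sends this body; the sketch only shows that nothing sits between the client's request and the receiver.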

The pattern's defining shape:

  • One ingestion endpoint speaks OTLP/gRPC and HTTP REST.
  • No broker (Kafka / Pulsar / Kinesis) between client and storage.
  • Managed, not customer-operated.
  • Storage is a governed lakehouse table, not an APM-vendor backend.
  • Schema is OTel-native (spans / logs / metrics) plus optional vendor extensions.
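
Because the endpoint speaks standard OTLP, re-pointing an existing exporter is usually a configuration change rather than a code change. The sketch below builds the standard `OTEL_EXPORTER_OTLP_*` environment variables defined by the OpenTelemetry specification. The helper name `otlp_env`, the endpoint, and the header values are illustrative assumptions, not values from the source.

```python
from typing import Dict, Optional
from urllib.parse import quote

# Protocol values defined by the OpenTelemetry exporter configuration spec.
VALID_PROTOCOLS = {"grpc", "http/protobuf", "http/json"}


def otlp_env(endpoint: str, protocol: str = "grpc",
             headers: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return environment variables that point an OTLP exporter at `endpoint`."""
    if protocol not in VALID_PROTOCOLS:
        raise ValueError(f"unsupported OTLP protocol: {protocol!r}")
    env = {
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
        "OTEL_EXPORTER_OTLP_PROTOCOL": protocol,
    }
    if headers:
        # Headers are comma-separated key=value pairs; values are percent-encoded.
        env["OTEL_EXPORTER_OTLP_HEADERS"] = ",".join(
            f"{key}={quote(value, safe='')}" for key, value in sorted(headers.items())
        )
    return env


if __name__ == "__main__":
    # Hypothetical endpoint and token, for illustration only.
    for name, value in otlp_env(
        "https://otel-ingest.example.com:443",
        protocol="grpc",
        headers={"authorization": "Bearer <token>"},
    ).items():
        print(f"export {name}='{value}'")
```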

Canonical instance: Databricks Zerobus → UC OTel Trace Tables (2026-05-22)

"Databricks supports ingesting OpenTelemetry (OTel) traces, logs, and metrics directly into Unity Catalog tables, using the OTel standard to separate instrumentation from storage. Databricks removes the operational complexity of traditional, multi-hop telemetry pipelines by providing a managed ingestion layer, transparently powered by Zerobus Ingest."

"With a 'single-sink' architecture, Zerobus Ingest simplifies observability by streaming data directly to the lakehouse. Existing OTLP-compatible collectors can point directly to this endpoint via gRPC, entirely bypassing intermediate message buses like Kafka. Zerobus Ingest acts as your high-throughput telemetry pipeline, handling ingestion and durability with zero infrastructure overhead."

— Source: sources/2026-05-22-databricks-observability-any-agent-anywhere-otel-unity-catalog

Components:

  • Receiver: Zerobus Ingest — managed, serverless, OTLP/gRPC + REST.
  • Storage: UC OTel Trace Tables — six MLflow-derived UC Delta views.
  • Instrumentation companion: MLflow OTel Tracing — the framework-side library.
  • Throughput floor: 200 QPS; higher rates require account-team escalation.
  • Storage capacity: unbounded, with automatic liquid clustering.
  • Per-experiment trace cap removed (a constraint of the prior MLflow architecture).
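
A client that wants to stay within the 200 QPS figure can pace its own export calls. The following is a minimal token-bucket sketch. It assumes the limit counts export requests per second, which the source does not specify, and the `TokenBucket` class is illustrative rather than part of any SDK.

```python
import time


class TokenBucket:
    """Allow up to `rate_per_s` requests per second, with bursts up to `burst`."""

    def __init__(self, rate_per_s: float, burst: float, clock=time.monotonic):
        self.rate = rate_per_s
        self.capacity = burst
        self.tokens = burst          # start with a full bucket
        self.clock = clock           # injectable for deterministic tests
        self.last = clock()

    def try_acquire(self, n: float = 1.0) -> bool:
        """Take `n` tokens if available; return False when the caller should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_per_s=200, burst=200)
    sent = sum(bucket.try_acquire() for _ in range(1000))
    print(f"sent {sent} of 1000 requests in the first instant")
```

Requests that fail `try_acquire` would be batched or delayed on the client side; batching more spans per request is usually the cheaper way to stay under a request-rate limit.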

When this pattern is the right shape

Property | Why this pattern wins
Single canonical analytical destination | The lakehouse is already where analysts work; observability data lands where the analytics happens.
Lakehouse-resident governance is required | UC column masking, row filtering, RBAC, and audit logs apply automatically to traces.
Long retention is a requirement | Delta on object storage is roughly an order of magnitude cheaper than SaaS APM retention.
Joining telemetry with business data is the question | Lakehouse co-location enables joins that an APM backend can't perform.
Operational simplicity matters | There is one managed receiver and one storage system to reason about.
OTel-instrumented agents already exist | Existing exporters can be re-pointed as a drop-in change, with no re-instrumentation cost.
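
The join property is the easiest to illustrate. The sketch below uses an in-memory SQLite database as a stand-in for lakehouse SQL. The table names, columns, and rows are hypothetical and do not reflect the actual UC trace table schema; in a real OTel schema a field like `customer_id` would more likely be a span attribute than a top-level column.

```python
import sqlite3


def latency_by_plan(conn: sqlite3.Connection) -> list:
    """Join span telemetry with a business table and aggregate per plan."""
    return conn.execute(
        """
        SELECT c.plan,
               COUNT(*)                      AS spans,
               AVG(s.duration_ms)            AS avg_ms,
               SUM(s.status_code = 'ERROR')  AS errors
        FROM otel_spans AS s
        JOIN customers  AS c USING (customer_id)
        GROUP BY c.plan
        ORDER BY c.plan
        """
    ).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Create illustrative tables; every row here is made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE otel_spans (trace_id TEXT, customer_id TEXT,
                                 duration_ms REAL, status_code TEXT);
        CREATE TABLE customers  (customer_id TEXT, plan TEXT);
        INSERT INTO otel_spans VALUES ('t1', 'c1', 100, 'OK'),
                                      ('t2', 'c1', 300, 'ERROR'),
                                      ('t3', 'c2',  50, 'OK');
        INSERT INTO customers  VALUES ('c1', 'enterprise'), ('c2', 'free');
        """
    )
    return conn


if __name__ == "__main__":
    for row in latency_by_plan(demo_connection()):
        print(row)
```

The query is ordinary SQL; the point is that telemetry and business tables live in the same engine, so no export step is needed before joining.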

When this pattern is the wrong shape

  • Real-time alerting is the primary use case. Lakehouse query latency is seconds-to-minutes; APM / Prometheus is sub-second. Pair with an APM sidecar for alerting.
  • Multi-destination fan-out is needed. A single sink delivers to one destination only; broker-based architectures fan out cleanly to multiple independent consumers.
  • Burst absorption is critical. Brokers buffer; managed receivers may apply back-pressure that propagates to clients.
  • Cross-organisation rendezvous. When telemetry from many independent producers must aggregate at a logical point, brokers are a natural fit; a single organisation's governed lakehouse tables are not designed as a shared rendezvous point.
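
The burst-absorption point can be illustrated from the client side. Without a broker, the only buffer is whatever the client holds in memory, so a burst larger than that buffer is either dropped or blocks the producer. The sketch below is a minimal bounded buffer with a drop counter. The class name and capacities are illustrative, and it says nothing about how Zerobus itself applies back-pressure.

```python
from collections import deque


class BoundedSpanBuffer:
    """In-memory buffer that rejects new items when full and counts the drops."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.dropped = 0
        self._items = deque()

    def offer(self, item) -> bool:
        """Enqueue `item`; return False and count a drop if the buffer is full."""
        if len(self._items) >= self.capacity:
            self.dropped += 1
            return False
        self._items.append(item)
        return True

    def drain(self, max_items: int) -> list:
        """Remove and return up to `max_items` items, oldest first."""
        batch = []
        while self._items and len(batch) < max_items:
            batch.append(self._items.popleft())
        return batch

    def __len__(self) -> int:
        return len(self._items)


if __name__ == "__main__":
    buf = BoundedSpanBuffer(capacity=512)
    accepted = sum(buf.offer(i) for i in range(1000))  # burst of 1000 spans
    print(f"accepted={accepted} dropped={buf.dropped} queued={len(buf)}")
```

A broker moves this buffer out of the client process and onto durable storage, which is the trade-off the bullet describes.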

Composition with other patterns

Caveats

  • Lakehouse-vendor lock-in. Avoiding APM-vendor lock-in trades it for lakehouse-vendor lock-in. Open table formats (Delta, Iceberg) mitigate this for the stored data, but the managed receiver itself (Zerobus) is vendor-specific.
  • Single-sink claim is architectural marketing. The Kafka-bypass argument is plausible but rarely benchmarked.
  • Throughput floor is modest. 200 QPS on the canonical instance — high-traffic agent fleets may need account-team escalation.
  • Latency from emit to query is not characterised. The post does not state end-to-end SLO numbers for Zerobus Ingest.
  • Sample-and-summarise alternatives may be cheaper when full trace retention isn't required. The pattern's economics assume the customer wants full lakehouse-resident retention; if they only need 1-week APM-style traces, traditional shapes may win on simplicity.
  • Managed receiver internals are opaque. Durability semantics, partition strategy, and back-pressure behaviour are vendor-internal details.

Seen in
