CONCEPT Cited by 1 source
Instrumentation-storage decoupling¶
Instrumentation-storage decoupling is the architectural property of using a stable, portable protocol (OpenTelemetry / OTLP) as the boundary between the agent / service emitting telemetry and the system that stores and analyses it — so that the agent runtime is not coupled to the storage backend and either side can swap independently.
The boundary¶
agent code (LangGraph / OpenAI SDK / Anthropic SDK / framework-agnostic)
│
▼
OTel SDK / collector ◄── PROTOCOL BOUNDARY (OTLP/gRPC, OTLP/HTTP)
│
▼
storage backend (lakehouse / APM / Honeycomb / Datadog / etc.)
The OTel SDK is the producer side of the boundary. Anything that speaks OTLP is the consumer side. The agent code knows nothing about which backend stores the trace.
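The diagram can be modelled in a few lines. The following is a stdlib-only toy sketch, not the OpenTelemetry SDK: `Span`, `Exporter`, `LakehouseBackend`, `ApmBackend`, and `run_agent` are hypothetical names introduced for illustration. It shows the one property that matters here: agent code depends only on the exporter interface, so the backend can change without touching it.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class Span:
    """Minimal stand-in for a trace span (hypothetical, not the OTel data model)."""
    name: str
    attributes: tuple = ()


class Exporter(Protocol):
    """The protocol boundary: the only thing agent code is allowed to see."""
    def export(self, span: Span) -> None: ...


@dataclass
class LakehouseBackend:
    """Hypothetical consumer that stores each span as a table-like row."""
    rows: list = field(default_factory=list)

    def export(self, span: Span) -> None:
        self.rows.append({"name": span.name, **dict(span.attributes)})


@dataclass
class ApmBackend:
    """Hypothetical consumer that keeps only span names."""
    names: list = field(default_factory=list)

    def export(self, span: Span) -> None:
        self.names.append(span.name)


def run_agent(exporter: Exporter) -> None:
    """Agent code: emits spans and never names a backend."""
    exporter.export(Span("plan", (("step", 1),)))
    exporter.export(Span("tool_call", (("step", 2),)))


lakehouse, apm = LakehouseBackend(), ApmBackend()
run_agent(lakehouse)  # same agent code...
run_agent(apm)        # ...different storage backend
```

Both backends receive the same two spans from an unchanged `run_agent`; what each one keeps is its own decision, which is also why the caveats further down apply.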
Canonical statement (Databricks, 2026-05-22)¶
"Databricks supports ingesting OpenTelemetry (OTel) traces, logs, and metrics directly into Unity Catalog tables, using the OTel standard to separate instrumentation from storage."
"Any OTel-compatible client can export traces to this endpoint, including popular AI agent frameworks across many programming languages."
"Q: Can I use this for agents running outside of Databricks? A: Yes, the agent can be running anywhere. In fact the support assistant agent example that was used for this blog is deployed locally."
— Source: sources/2026-05-22-databricks-observability-any-agent-anywhere-otel-unity-catalog
The "agent runs anywhere" property is the user-facing payoff of the decoupling.
Why it matters for system design¶
Without decoupling: agent code carries a vendor-specific tracing SDK. Switching from Datadog to Honeycomb requires re-instrumenting every agent. Adding a new agent in a new language requires a new SDK port. Every storage-side innovation requires a coordinated client-side rollout.
With decoupling:
- Agents in customer VPCs, on developer laptops, in third-party clouds, or inside the platform itself all emit to the same place.
- Storage backend can be swapped — same OTLP traffic can be redirected to Datadog, Honeycomb, Tempo, or a lakehouse without changing agent code.
- Multi-language fleets are tractable because the OTel SDK is implemented per language with a shared protocol.
- New observability capabilities can ship at the storage layer without an agent re-roll — the 2026-05-22 launch is itself an example: existing OTel-instrumented agents "can point directly to this endpoint via gRPC".
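In practice, re-pointing is usually a configuration change rather than a code change. The OpenTelemetry specification defines environment variables such as `OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_EXPORTER_OTLP_PROTOCOL` that SDK exporters read at start-up. The sketch below builds that environment for two backends; the `otlp_env` helper and both endpoint URLs are hypothetical placeholders, not real vendor endpoints.

```python
def otlp_env(endpoint: str, protocol: str = "grpc") -> dict[str, str]:
    """Return OTel-standard exporter settings for one backend.

    The variable names come from the OpenTelemetry specification; the
    values are whatever the deployment needs.
    """
    allowed = {"grpc", "http/protobuf", "http/json"}
    if protocol not in allowed:
        raise ValueError(f"unsupported OTLP protocol: {protocol!r}")
    return {
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
        "OTEL_EXPORTER_OTLP_PROTOCOL": protocol,
    }


# Hypothetical endpoints, used only to show that the switch is pure config.
before = otlp_env("https://apm.example.com:4317")
after = otlp_env("https://lakehouse.example.com:4318", "http/protobuf")

# The keys that differ between the two deployments.
changed = sorted(key for key in before if before[key] != after[key])
```

Only the values differ between `before` and `after`. An agent process launched with the new environment would export to the new backend while its instrumentation code stays untouched, assuming the target accepts the chosen OTLP transport.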
Structural payoffs¶
- Vendor portability: telemetry investment is not locked to a single backend's SDK / proprietary protocol.
- Drop-in re-pointing: existing OTel collectors can swap their exporter config to point at a new endpoint without re-instrumenting.
- Cross-language consistency: same protocol semantics across Java, Python, Go, JS, Rust, etc.
- Independent evolution: instrumentation library cadence and storage-backend cadence don't have to synchronise.
- "Agent anywhere": agents running outside the storage vendor's environment are first-class citizens — a critical property for enterprise customers who can't move every workload onto one platform.
Caveats¶
- OTel coverage gaps. The OTel project's adapter coverage is uneven; framework-specific autolog (e.g. mlflow.langchain.autolog()) may be richer than what generic OTel provides.
- Semantic conventions are still evolving. OTel's gen-AI semantic conventions are newer than its HTTP / RPC conventions; storage backends differ on which fields they parse vs treat as opaque.
- Performance characteristics differ across SDKs. Java / Go OTel SDKs are mature; Python's overhead is higher; JS's batching defaults differ.
- Decoupling does not mean equivalence. Different storage backends may parse the same OTLP payload differently; switching backends is cheaper than re-instrumenting but not free.
- Vendor extensions break the boundary. When MLflow adds _otel_annotations for assessments / feedback / expectations (an MLflow-specific extension), that data is not portable to a non-MLflow backend.
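The last two caveats can be made concrete with a toy check. The sketch below assumes a backend that parses only attribute keys under a known set of prefixes and treats everything else as opaque. `KNOWN_PREFIXES`, the `acme.` vendor key, and the `split_portable` helper are all hypothetical; the convention-style keys are examples only, and no real backend's parsing rules are implied.

```python
# Hypothetical: prefixes this imaginary backend knows how to parse.
KNOWN_PREFIXES = ("gen_ai.", "http.", "rpc.")


def split_portable(attributes: dict) -> tuple[dict, dict]:
    """Split span attributes into (parsed by the backend, kept as opaque)."""
    portable, opaque = {}, {}
    for key, value in attributes.items():
        # str.startswith accepts a tuple and matches any of its prefixes.
        target = portable if key.startswith(KNOWN_PREFIXES) else opaque
        target[key] = value
    return portable, opaque


span_attributes = {
    "gen_ai.request.model": "example-model",  # convention-style key
    "http.request.method": "POST",            # convention-style key
    "acme.feedback.score": 0.9,               # hypothetical vendor extension
}
portable, opaque = split_portable(span_attributes)
```

Running the same check with each candidate backend's own prefix list is one rough way to estimate how much telemetry would survive a switch with its meaning intact.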
Relationship to other concepts¶
- Composes with concepts/single-sink-telemetry-architecture: decoupling is client-side; single-sink is ingest-side. Together they describe the 2026-05-22 Databricks architecture.
- Specialised by systems/opentelemetry: OTel is the concrete protocol that operationalises the decoupling.
- Operationalised by systems/mlflow-otel-tracing: MLflow's OTel-tracing surface is the agent-side instrumentation that exports across the boundary.
- Sibling to concepts/context-propagation-otel: context propagation describes what travels across the boundary; instrumentation-storage decoupling describes the fact that the boundary exists at all.
Seen in¶
- sources/2026-05-22-databricks-observability-any-agent-anywhere-otel-unity-catalog — canonical statement: "using the OTel standard to separate instrumentation from storage"; the "agent can be running anywhere" FAQ; the framework-agnostic "any OTel-compatible client" framing.
Related¶
- concepts/single-sink-telemetry-architecture — paired ingest-side concept.
- concepts/lakehouse-native-observability — the storage-side posture this enables.
- concepts/observability — parent concept.
- concepts/context-propagation-otel — what travels across the decoupled boundary.
- systems/opentelemetry — the protocol that operationalises the decoupling.
- systems/zerobus-ingest — storage-side endpoint.
- systems/uc-otel-trace-tables — storage-side schema.
- systems/mlflow-otel-tracing — instrumentation-side library.
- patterns/managed-otel-ingestion-direct-to-lakehouse — pattern that exploits the decoupling.
- patterns/telemetry-to-lakehouse — generalisation.
- companies/databricks