OpenTracing¶
OpenTracing is a vendor-neutral API specification for distributed tracing, originally developed as a CNCF project. It defines the core primitives — span, trace, span context, tags, logs, baggage, and semantic conventions — and a pluggable tracer interface so that application code can be instrumented once and routed to any backend (Jaeger, Zipkin, vendor APMs, etc.).
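To make the primitives concrete, here is a minimal toy model in Python. It is not the OpenTracing API: `Span`, `ToyTracer`, and the `reporter` callback are hypothetical names chosen for illustration, and baggage is left out for brevity. It only sketches how spans carry tags and logs, how a child span links to its parent within one trace, and how the backend can be swapped without touching instrumentation code.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Span:
    """One timed operation; tags describe it, parent_id links it into a trace."""
    trace_id: str
    span_id: str
    operation_name: str
    parent_id: Optional[str] = None
    tags: Dict[str, object] = field(default_factory=dict)
    logs: List[tuple] = field(default_factory=list)
    start_time: float = field(default_factory=time.time)
    finish_time: Optional[float] = None

    def set_tag(self, key: str, value: object) -> "Span":
        self.tags[key] = value
        return self

    def log(self, event: str) -> "Span":
        # A log is a timestamped event attached to the span.
        self.logs.append((time.time(), event))
        return self


class ToyTracer:
    """Creates spans and hands finished ones to a pluggable reporter (the backend)."""

    def __init__(self, reporter: Callable[[Span], None]):
        self._reporter = reporter

    def start_span(self, operation_name: str, child_of: Optional[Span] = None) -> Span:
        # A child span inherits the trace id and records its parent's span id.
        trace_id = child_of.trace_id if child_of else uuid.uuid4().hex
        parent_id = child_of.span_id if child_of else None
        return Span(trace_id, uuid.uuid4().hex[:16], operation_name, parent_id)

    def finish(self, span: Span) -> None:
        span.finish_time = time.time()
        self._reporter(span)


# Swapping the reporter stands in for swapping Jaeger, Zipkin, or a vendor APM.
collected: List[Span] = []
tracer = ToyTracer(reporter=collected.append)

root = tracer.start_span("GET /catalog").set_tag("http.method", "GET")
child = tracer.start_span("select-articles", child_of=root)
child.set_tag("db.statement", "SELECT * FROM articles").log("query sent")
tracer.finish(child)
tracer.finish(root)
```

The instrumented code only ever talks to `tracer`; replacing `collected.append` with a function that ships spans elsewhere changes the backend without changing the call sites, which is the property the pluggable tracer interface is meant to provide.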
Status: merged into OpenTelemetry (2019)¶
OpenTracing and OpenCensus announced their merger into OpenTelemetry in 2019, and OpenTelemetry is now the de facto distributed-tracing standard. The OpenTracing project itself was later archived and no longer evolves. Instrumentation written against OpenTracing still runs today via bridges and compatibility shims, but new instrumentation should target OpenTelemetry. The core primitives (span, span context, semantic conventions) carried over essentially unchanged.
What carries over¶
- Spans with parent-child relationships — the causal graph OpenTracing introduced is preserved by OpenTelemetry. This is what primitives like adaptive paging depend on to identify the "closest team to the problem" at alert time.
- Semantic conventions: standardised tag names (`http.method`, `db.statement`, etc.) moved to OpenTelemetry with minor renames; the interop story across OpenTracing-instrumented and OpenTelemetry-instrumented services is good.
- Context propagation formats: W3C Trace Context subsumed OpenTracing's propagators.
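As an illustration of the propagation point above, the following sketch writes and reads the W3C Trace Context `traceparent` header (version `00`: a 32-hex trace id, a 16-hex parent span id, and a 2-hex flags field, joined by hyphens). The `SpanContext` tuple and the standalone `inject` / `extract` helpers are hypothetical; they echo the inject/extract vocabulary of tracer APIs but are not taken from any library, and `tracestate` handling is omitted.

```python
import re
from typing import NamedTuple, Optional


class SpanContext(NamedTuple):
    trace_id: str   # 32 lowercase hex characters
    span_id: str    # 16 lowercase hex characters
    sampled: bool


# Only version "00" of the header is handled in this sketch.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def inject(ctx: SpanContext, headers: dict) -> None:
    """Write the span context into outgoing request headers."""
    flags = "01" if ctx.sampled else "00"
    headers["traceparent"] = f"00-{ctx.trace_id}-{ctx.span_id}-{flags}"


def extract(headers: dict) -> Optional[SpanContext]:
    """Read the span context from incoming headers; None if absent or malformed."""
    match = _TRACEPARENT.match(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero ids are invalid in W3C Trace Context.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return SpanContext(trace_id, span_id, sampled=bool(int(flags, 16) & 0x01))


# Round trip using the example ids from the W3C specification.
outgoing: dict = {}
ctx = SpanContext("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", True)
inject(ctx, outgoing)
restored = extract(outgoing)
```

Because both sides agree on the header format rather than on a tracer implementation, a service instrumented with one API can continue a trace started by a service instrumented with another, provided both are configured to emit and accept this header.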
Zalando instantiation¶
Zalando (companies/zalando) rolled out OpenTracing platform-wide during Cyber Week preparations (Phase 2 of their SRE evolution), starting with tier-1 hot-path browse services and expanding to tier-2 the following year. They adopted traffic-source conventions that tag each request with its originating class (App, Web, push notifications, load tests) to support capacity planning; see concepts/traffic-source-tagging-in-traces. The causality data from traces, combined with OpenTracing semantic conventions, is the substrate for their adaptive paging alert handler (Source: ).
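A rough sketch of how traffic-source tags could feed capacity planning follows. The tag key `traffic_source` and the values `app`, `web`, `push`, and `load_test` are assumptions made for this example; the source names the four traffic classes but not the actual tag names Zalando uses.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical tag key and values; only the four traffic classes come from the text.
TRAFFIC_SOURCE_TAG = "traffic_source"
KNOWN_SOURCES = {"app", "web", "push", "load_test"}


def requests_by_source(root_span_tags: Iterable[Dict[str, str]]) -> Counter:
    """Count root spans per traffic source; untagged or unrecognised values become 'unknown'."""
    counts: Counter = Counter()
    for tags in root_span_tags:
        source = tags.get(TRAFFIC_SOURCE_TAG, "unknown")
        counts[source if source in KNOWN_SOURCES else "unknown"] += 1
    return counts


def organic_share(counts: Counter) -> float:
    """Fraction of requests that are not synthetic load-test traffic."""
    total = sum(counts.values())
    return 0.0 if total == 0 else (total - counts["load_test"]) / total


# Tags of five root spans, one of them untagged.
sample = [
    {"traffic_source": "app"},
    {"traffic_source": "app"},
    {"traffic_source": "web"},
    {"traffic_source": "load_test"},
    {},
]
counts = requests_by_source(sample)
share = organic_share(counts)
```

Separating load-test traffic from organic traffic in this way is one plausible reason to tag the source at the root span: throughput measured during a load test can then be compared against real customer traffic without the two being conflated.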
Seen in¶
- — Zalando's fleet-wide OpenTracing rollout, tier-gated expansion, traffic-source tagging, adaptive-paging application.
- — Zalando Kotlin Guild names opentracing-toolbox as the default tracing library for new Kotlin backend services on Spring Boot; ships a dedicated opentracing-kotlin submodule. Confirms OpenTracing is still the live instrumentation standard at Zalando in 2021 despite the 2019 OpenTelemetry merger.
- — canonicalises OpenTracing's role as the platform substrate for four SRE-team-owned capabilities in 2019: Adaptive Paging, Throughput Calculator, SLO Reporting Tool, and Operation-Based SLOs. Names the two critical artifacts: Zalando-specific Semantic Conventions (in addition to the standard ones) + an API to consume tracing data that lets ops-primitive authors query trace graphs by CBO + time range.
- sources/2022-04-27-zalando-operation-based-slos: canonicalises the `error`-tag SLI primitive that makes operation-based SLOs transport-agnostic. Availability SLOs at Zalando used to be 5xx-rate; the OpenTracing `error` tag lets application code mark an operation as conceptually failed even when the HTTP response is 200 OK (e.g. the second graceful-degradation fallback returned a reduced-quality payload). "OpenTracing's error tag makes it a lot easier for engineers to signal an operation as conceptually failed." Elevates OpenTracing's semantic conventions from a tracing-display nicety to the load-bearing primitive for SLI definition across RPC + non-HTTP paths.
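The difference between a 5xx-rate SLI and an `error`-tag SLI can be sketched as follows. `OperationSpan` and both functions are hypothetical and only illustrate the idea; they are not Zalando's SLO tooling. The `error` field mirrors the OpenTracing `error` tag, and `http_status` is optional so that non-HTTP operations fit the same shape.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class OperationSpan:
    operation: str
    http_status: Optional[int] = None  # None for non-HTTP transports
    error: bool = False                # mirrors the OpenTracing `error` tag


def availability_by_status(spans: List[OperationSpan]) -> float:
    """Transport-bound SLI: only HTTP 5xx responses count as failures."""
    if not spans:
        return 1.0
    failed = sum(1 for s in spans if s.http_status is not None and s.http_status >= 500)
    return 1 - failed / len(spans)


def availability_by_error_tag(spans: List[OperationSpan]) -> float:
    """Operation-based SLI: any span tagged error=true counts as a failure."""
    if not spans:
        return 1.0
    failed = sum(1 for s in spans if s.error)
    return 1 - failed / len(spans)


spans = [
    OperationSpan("view-product", 200),
    OperationSpan("view-product", 200),
    OperationSpan("view-product", 200, error=True),  # degraded fallback, still 200 OK
    OperationSpan("view-product", 503, error=True),
]
status_sli = availability_by_status(spans)     # 0.75
error_sli = availability_by_error_tag(spans)   # 0.5
```

The degraded-but-200 request is invisible to the status-based SLI and counted by the tag-based one, which is the gap the `error` tag is described as closing.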
Related¶
- systems/opentelemetry — successor standard; target for new instrumentation.
- systems/opentracing-toolbox — Zalando's integration library with Spring Boot + Kotlin submodules.
- systems/zalando-adaptive-paging — the first ops-primitive built on the Tracing API.
- systems/zalando-throughput-calculator — uses tracing fan-out data for capacity projection.
- systems/zalando-service-level-management-tool: consumes the `error`-tag SLI primitive for operation-based SLOs.
- concepts/observability
- concepts/adaptive-paging · concepts/critical-business-operation · concepts/operation-based-slo
- concepts/graceful-degradation: the `error` tag lets reduced-quality fallbacks register as CBO failures despite HTTP 200 responses.
- concepts/traffic-source-tagging-in-traces