OpenTracing

OpenTracing is a vendor-neutral API specification for distributed tracing, hosted by the CNCF. It defines the core primitives (span, trace, span context, tags, logs, baggage, and semantic conventions) and a pluggable tracer interface, so that application code can be instrumented once and routed to any backend (Jaeger, Zipkin, vendor APMs, etc.).
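To make the "instrument once, swap the backend" idea concrete, here is a minimal sketch in plain Python. It is not the real opentracing package: `InMemoryTracer`, `Span`, and `handle_request` are hypothetical names invented for illustration, and only the method shapes (`start_span(..., child_of=...)`, `set_tag`, `finish`) are modelled on the OpenTracing API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    """Hypothetical span: one named, tagged unit of work inside a trace."""
    tracer: "InMemoryTracer"
    operation_name: str
    trace_id: int
    span_id: int
    parent_id: Optional[int] = None
    tags: Dict[str, object] = field(default_factory=dict)

    def set_tag(self, key: str, value: object) -> "Span":
        self.tags[key] = value
        return self

    def finish(self) -> None:
        # Hand the finished span to whichever backend created it.
        self.tracer.finished.append(self)


class InMemoryTracer:
    """Hypothetical backend that just keeps finished spans in a list."""

    def __init__(self) -> None:
        self.finished: List[Span] = []
        self._next_id = 1

    def start_span(self, operation_name: str,
                   child_of: Optional[Span] = None) -> Span:
        span_id = self._next_id
        self._next_id += 1
        if child_of is None:
            # A root span starts a new trace.
            return Span(self, operation_name, trace_id=span_id, span_id=span_id)
        # A child span joins its parent's trace and records the causal edge.
        return Span(self, operation_name, child_of.trace_id, span_id,
                    child_of.span_id)


def handle_request(tracer) -> None:
    """Application code: depends only on the tracer interface, not the backend."""
    root = tracer.start_span("GET /catalog").set_tag("http.method", "GET")
    query = tracer.start_span("load-articles", child_of=root)
    query.set_tag("db.statement", "SELECT id FROM articles")
    query.finish()
    root.finish()
```

Because `handle_request` only calls `start_span`, `set_tag`, and `finish`, any object offering those methods could stand in for `InMemoryTracer`, which is the property the pluggable tracer interface is meant to provide.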

Status: merged into OpenTelemetry (2019)

OpenTracing and OpenCensus merged in 2019 to form OpenTelemetry, which is now the de facto distributed-tracing standard; the OpenTracing project has since been archived. Instrumentation written against OpenTracing still runs today via bridges and compatibility shims, but new instrumentation should target OpenTelemetry. The core primitives (span, span context, semantic conventions) carried over essentially unchanged.

What carries over

  • Spans with parent-child relationships: the causal graph OpenTracing introduced is preserved by OpenTelemetry. This is what primitives like adaptive paging depend on to identify the "closest team to the problem" at alert time.
  • Semantic conventions: standardised tag names (http.method, db.statement, etc.) moved to OpenTelemetry with minor renames, so services instrumented with OpenTracing and services instrumented with OpenTelemetry interoperate well.
  • Context propagation formats: W3C Trace Context, OpenTelemetry's default propagator, supersedes the tracer-specific header formats used behind OpenTracing's inject/extract API.
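As an illustration of the context-propagation point, the following sketch injects and extracts a W3C Trace Context `traceparent` header (version `00`: a 32-hex-digit trace id, a 16-hex-digit parent span id, and 2 hex digits of flags). `SpanContext`, `inject`, and `extract` are hypothetical helpers, not the OpenTracing or OpenTelemetry API. The sketch makes two simplifying assumptions: headers are a plain dict with lowercase keys, and only the exact four-field form is accepted.

```python
import re
from typing import Dict, NamedTuple, Optional

# version "-" trace-id "-" parent-id "-" trace-flags, all lowercase hex.
_TRACEPARENT = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


class SpanContext(NamedTuple):
    """Hypothetical carrier for the identifiers that cross process boundaries."""
    trace_id: str
    span_id: str
    sampled: bool


def inject(ctx: SpanContext, headers: Dict[str, str]) -> None:
    """Write the span context into outgoing request headers."""
    flags = "01" if ctx.sampled else "00"
    headers["traceparent"] = f"00-{ctx.trace_id}-{ctx.span_id}-{flags}"


def extract(headers: Dict[str, str]) -> Optional[SpanContext]:
    """Read a span context from incoming headers, or None if absent/invalid."""
    match = _TRACEPARENT.fullmatch(headers.get("traceparent", ""))
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero identifiers are invalid per the specification.
    if version == "ff" or set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    sampled = bool(int(flags, 16) & 0x01)
    return SpanContext(trace_id, span_id, sampled)
```

A downstream service would call `extract` on the incoming request and use the returned context as the parent of its own span, which is how the parent-child graph survives across process boundaries.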

Zalando instantiation

Zalando rolled out OpenTracing platform-wide during Cyber Week preparations (Phase 2 of their SRE evolution), starting with tier-1 hot-path browse services and expanding to tier-2 the following year. To support capacity planning, they adopted traffic-source conventions that tag each request with its originating class (App / Web / push notifications / load tests); see concepts/traffic-source-tagging-in-traces. The causality data from traces, combined with OpenTracing semantic conventions, is the substrate for their adaptive paging alert handler (Source: ).
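The traffic-source convention can be pictured with a small sketch. The tag key `traffic.source` and the value spellings below are assumptions made for illustration; the source names the four classes but not how Zalando actually encodes them. Spans are represented here simply as dicts of tags.

```python
from collections import Counter
from typing import Dict, Iterable

# Assumed value set, derived from the four classes named in the text.
ALLOWED_SOURCES = {"app", "web", "push_notification", "load_test"}
TRAFFIC_SOURCE_TAG = "traffic.source"  # hypothetical tag key


def tag_traffic_source(tags: Dict[str, object], source: str) -> Dict[str, object]:
    """Attach the originating traffic class to a root span's tags."""
    tags[TRAFFIC_SOURCE_TAG] = source if source in ALLOWED_SOURCES else "unknown"
    return tags


def requests_by_source(root_span_tags: Iterable[Dict[str, object]]) -> Counter:
    """Count root spans per traffic class, e.g. as input to capacity planning."""
    return Counter(
        tags.get(TRAFFIC_SOURCE_TAG, "unknown") for tags in root_span_tags
    )
```

Separating `load_test` traffic from organic traffic in this way is what would let a capacity estimate exclude synthetic load, under the assumptions stated above.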

Seen in

  • — Zalando's fleet-wide OpenTracing rollout, tier-gated expansion, traffic-source tagging, adaptive-paging application.
  • — Zalando Kotlin Guild names opentracing-toolbox as the default tracing library for new Kotlin backend services on Spring Boot; ships a dedicated opentracing-kotlin submodule. Confirms OpenTracing is still the live instrumentation standard at Zalando in 2021 despite the 2019 OpenTelemetry merger.
  • — canonicalises OpenTracing's role as the platform substrate for four SRE-team-owned capabilities in 2019: Adaptive Paging, Throughput Calculator, SLO Reporting Tool, and Operation-Based SLOs. Names the two critical artifacts: Zalando-specific Semantic Conventions (in addition to the standard ones) + an API to consume tracing data that lets ops-primitive authors query trace graphs by CBO + time range.
  • sources/2022-04-27-zalando-operation-based-slos — canonicalises the error-tag SLI primitive that makes operation-based SLOs transport-agnostic. Availability SLOs at Zalando used to be 5xx-rate; the OpenTracing error tag lets application code mark an operation as conceptually failed even when the HTTP response is 200 OK (e.g. the second graceful-degradation fallback returned a reduced-quality payload). "OpenTracing's error tag makes it a lot easier for engineers to signal an operation as conceptually failed." Elevates OpenTracing's semantic conventions from a tracing-display nicety to the load-bearing primitive for SLI definition across RPC + non-HTTP paths.
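The difference between a 5xx-rate SLI and an error-tag SLI can be shown with a short sketch. It uses the standard OpenTracing tag names `error` and `http.status_code`; the functions and the dict-of-tags span representation are hypothetical, and this is not Zalando's SLO tooling.

```python
from typing import Dict, List

Tags = Dict[str, object]


def availability_by_5xx(spans: List[Tags]) -> float:
    """Transport-level view: an operation fails only if it returned HTTP 5xx."""
    if not spans:
        raise ValueError("no spans to evaluate")
    failed = sum(1 for s in spans if int(s.get("http.status_code", 0)) >= 500)
    return 1 - failed / len(spans)


def availability_by_error_tag(spans: List[Tags]) -> float:
    """Operation-level view: an operation fails if the code tagged it error=True."""
    if not spans:
        raise ValueError("no spans to evaluate")
    failed = sum(1 for s in spans if s.get("error") is True)
    return 1 - failed / len(spans)


# Four operations: the second returned 200 OK from a degraded fallback,
# and the fourth is a non-HTTP operation with no status code at all.
example_spans: List[Tags] = [
    {"http.status_code": 200},
    {"http.status_code": 200, "error": True},
    {"http.status_code": 500, "error": True},
    {"error": False},
]
```

On `example_spans`, the 5xx view counts one failure while the error-tag view counts two, because the degraded 200 OK response is visible only through the tag. The non-HTTP span is handled by the error-tag view without any transport-specific logic.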