CONCEPT Cited by 1 source
Traffic-source tagging in traces¶
Traffic-source tagging is the discipline of attaching a traffic-class tag to the root span (and propagating it via baggage / trace context) of every request, so that each request can be attributed at query time to its originating surface — App / Web / push notifications / load tests / internal batch jobs, etc. Capacity planning and alerting then become segmentable along this tag.
Why it's load-bearing¶
Generic trace tags (http.method, http.status_code, service.name) answer "what happened?" but not "on whose behalf?". For capacity planning, the important decomposition is by traffic class and the ratios between classes: if the checkout p99 breaches the SLO, is the breach coming from App or from Web? The distinction matters because App and Web traffic differ in:
- Geographic distribution.
- Caching behavior (mobile clients cache less aggressively; web clients have HTTP caches).
- Request shapes (batched vs individual).
- Scaling under a sales event (App frequently dominates mobile-first regions; Web dominates desktop regions).
Without the tag, these decompositions collapse into a single number and capacity planning operates on noise.
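To make the decomposition concrete, here is a minimal sketch of slicing a latency percentile by the tag. The tag key traffic.source, the duration_ms field, and the sample numbers are assumptions chosen for illustration; they are not taken from the source.

```python
from collections import defaultdict


def p99(values):
    """Nearest-rank 99th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n) in integer math
    return ordered[rank - 1]


def p99_by_source(root_spans):
    """Group root-span durations by the (assumed) traffic.source tag."""
    groups = defaultdict(list)
    for span in root_spans:
        groups[span.get("traffic.source", "unknown")].append(span["duration_ms"])
    return {source: p99(durations) for source, durations in groups.items()}


# Hypothetical checkout root spans: Web is uniformly fast, App has a slow tail.
spans = (
    [{"traffic.source": "web", "duration_ms": 100} for _ in range(100)]
    + [{"traffic.source": "app", "duration_ms": 120} for _ in range(95)]
    + [{"traffic.source": "app", "duration_ms": 900} for _ in range(5)]
)

overall = p99([s["duration_ms"] for s in spans])  # one blended number
per_source = p99_by_source(spans)                 # the same data, sliced
```

In this made-up data the blended p99 is 900 ms, which says only that something is slow. The sliced view shows Web at 100 ms and App at 900 ms, which is the answer the SLO question actually needs.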
The load-test tag is the important one¶
Zalando's statement of practice:
"During the instrumentation, we have adopted additional conventions that help us identify the traffic sources: App, Web, push notifications, load tests. This allows us to better understand traffic patterns and perform capacity planning based on the request ratios between incoming traffic and the respective parts of our platform." — sources/2020-10-07-zalando-how-zalando-prepares-for-cyber-week
The load-tests tag is especially load-bearing because Zalando runs live load tests in production — the load-test traffic is real traffic hitting real systems from simulators. Without the tag, load-test traffic pollutes:
- Capacity planning dashboards: load-test requests get counted as real user traffic, inflating the projected capacity requirement.
- Customer-impact alerts — a load-test-induced degradation fires customer-impact pages.
- Product-performance metrics — conversion / checkout rate becomes noisy during test windows.
Explicitly tagging it solves all three with a single convention.
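As an illustration of the single convention at work, the sketch below computes request ratios between traffic sources with load-test traffic excluded by default. The tag key traffic.source and the value loadtest are assumed names, and the span dictionaries stand in for whatever a real trace query would return.

```python
from collections import Counter

LOAD_TEST = "loadtest"  # assumed tag value for simulator traffic


def request_ratios(root_spans, include_load_tests=False):
    """Share of requests per traffic source, optionally counting load tests."""
    counts = Counter(
        span.get("traffic.source", "unknown")
        for span in root_spans
        if include_load_tests or span.get("traffic.source") != LOAD_TEST
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {source: n / total for source, n in counts.items()}


# Hypothetical window in which a load test doubles the request volume.
window = (
    [{"traffic.source": "app"}] * 60
    + [{"traffic.source": "web"}] * 40
    + [{"traffic.source": "loadtest"}] * 100
)

customer_view = request_ratios(window)                      # load test filtered out
raw_view = request_ratios(window, include_load_tests=True)  # polluted view
```

The same predicate (tag is not the load-test value) can be reused in alert rules and product metrics, which is why one tag addresses all three kinds of pollution.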
Implementation options¶
- Request header + root-span tag: the ingress / edge injects an X-Traffic-Source: app|web|push|loadtest header based on (a) the authenticating client, (b) user agent, (c) a special test-client credential. The ingress also populates the tag on the root span.
- Baggage propagation: the root span sets the tag, and OpenTelemetry / OpenTracing baggage carries it to downstream spans so every service's span inherits it. This adds propagation overhead on every hop but gives complete attribution.
- Fixed test-account users: load-test simulators authenticate as a known test account; the edge sets the tag based on the account class. This is Zalando's preferred model — test sales orders use "test products that can be clearly differentiated from real customer orders."
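The three options can be combined at the edge. The following is a rough, stdlib-only sketch rather than a description of Zalando's implementation: the account name, the user-agent markers, the tag key traffic.source, and the Request type are all hypothetical. The only external convention used is the W3C baggage header, which carries comma-separated key=value members.

```python
from dataclasses import dataclass, field
from typing import Optional

TEST_ACCOUNTS = {"loadtest-simulator"}  # hypothetical known test accounts
TAG_KEY = "traffic.source"              # assumed tag / baggage key


@dataclass
class Request:
    headers: dict = field(default_factory=dict)
    account_id: Optional[str] = None


def classify(request):
    """Decide the traffic class from account class first, then user agent."""
    if request.account_id in TEST_ACCOUNTS:
        return "loadtest"
    user_agent = request.headers.get("User-Agent", "").lower()
    if "examplepush/" in user_agent:  # hypothetical push-service marker
        return "push"
    if "exampleapp/" in user_agent:   # hypothetical mobile-app marker
        return "app"
    return "web"


def tag_at_edge(request, root_span_tags):
    """Set the header, the root-span tag, and the baggage member in one place."""
    source = classify(request)
    # Overwrite any client-supplied value: only the edge is trusted to set it.
    request.headers["X-Traffic-Source"] = source
    root_span_tags[TAG_KEY] = source
    # Keep other baggage members, replace a stale member for our own key.
    existing = request.headers.get("baggage", "")
    members = [m.strip() for m in existing.split(",") if m.strip()]
    members = [m for m in members if not m.startswith(TAG_KEY + "=")]
    members.append(f"{TAG_KEY}={source}")
    request.headers["baggage"] = ",".join(members)
    return source


# Usage: a simulator request that also tries to pass itself off as "web".
req = Request(
    headers={"X-Traffic-Source": "web", "baggage": "tenant=eu"},
    account_id="loadtest-simulator",
)
span_tags = {}
tag_at_edge(req, span_tags)
```

Classifying by account before user agent is a deliberate choice in this sketch: a simulator usually imitates a real client's user agent, so the account class is the more reliable signal. In a real deployment the baggage handling would normally go through the tracing library's propagator instead of manual string handling.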
Cardinality cost¶
The tag is low cardinality (single-digit distinct values) so it doesn't contribute meaningfully to index size — unlike user-ID or session-ID tags. This makes it effectively free to add to every span while being high-signal for slicing.
Prerequisites¶
- A standardised tracing substrate — OpenTracing or OpenTelemetry — with consistent baggage / span tag semantics.
- Identifiable traffic-source at the edge — either by auth class, known test account, synthetic user agent, or header.
- Downstream consumer awareness — dashboards, alert rules, SLO definitions updated to slice on the tag.
Related mechanisms¶
- User-class tagging (guest vs logged-in vs premium) — same mechanism, different axis.
- Synthetic-traffic tagging for observability — the general case; traffic-source tagging is the specific instantiation with real-user-vs-simulator as the cut.
- concepts/adaptive-paging — traffic-source tags are an input to adaptive-paging heuristics (e.g., a load-test alert should page the load-test team, not the service owner).
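The adaptive-paging example above can be sketched as a routing rule. This is a hypothetical illustration only: the alert shape, the tag key traffic.source, and the team names are invented here and do not describe any particular paging system.

```python
def page_target(alert, service_owner, load_test_team="load-test-team"):
    """Pick who gets paged based on the traffic-source slice that breached."""
    if alert.get("traffic.source") == "loadtest":
        # Degradation visible only in simulator traffic: the test owners act.
        return load_test_team
    # Any customer-facing slice (or an untagged alert) goes to the service owner.
    return service_owner


load_test_alert = {"name": "checkout-p99", "traffic.source": "loadtest"}
customer_alert = {"name": "checkout-p99", "traffic.source": "app"}
```

Routing untagged alerts to the service owner is the conservative default in this sketch, since a missing tag should not silence a potential customer-impact page.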
Seen in¶
- sources/2020-10-07-zalando-how-zalando-prepares-for-cyber-week — explicit App / Web / push / load-test convention adopted as part of fleet-wide OpenTracing rollout.