CONCEPT Cited by 1 source
Unified metric semantics¶
Unified metric semantics is the discipline of keeping metric names, label schemas, and metadata dimensions consistent across multiple storage backends so that users don't have to know (or care) which backend serves their query.
In a dual-tier observability architecture (aggregated TSDB + raw lakehouse), the two backends have very different performance characteristics and internal schemas — but the user-facing metric namespace is a single logical surface. The platform internally routes queries to the appropriate backend based on cardinality, time range, and query shape.
The "emit once, route many" posture¶
Service owners instrument their application with one standardised emission interface — typically an OTel SDK with the platform's conventions. That single emission stream is then:
- Aggregated — certain label combinations are pre- collapsed and forwarded to the TSDB (see Telegraf + patterns/aggregation-shield-for-tsdb-cardinality).
- Preserved raw — the full-cardinality stream is also written to the lakehouse (see systems/hydra).
- Possibly tee'd to other sinks — cost-analytics warehouse, long-term archive, external vendor integration.
From the user's perspective: metric_name{labels...} means the
same thing everywhere. Alert rules, dashboards, and ad-hoc
queries use the same name + label schema across both paths.
Why it's the load-bearing design principle¶
Without unified semantics, a dual-tier architecture becomes two products engineers have to learn — one for aggregated real-time queries, one for raw high-cardinality debugging. The cognitive tax alone would tip adoption decisions toward "keep everything in the TSDB and pay the cost." Unified semantics lets the platform team invisibly shift storage substrates based on cost / scale / freshness trade-offs without user action.
Implementation discipline¶
- Single emission interface — OTel-based, with platform- specific attribute conventions enforced in the SDK.
- Single metric catalog — every metric name, label schema, and valid label-value range is registered centrally; drift is detected and alerted on.
- Query router — chooses backend based on query properties (which label selectors, what time range, what aggregation). Transparent to the user.
- Version / migration management — when a metric's schema changes, both backends migrate in lockstep, or the query router serves the union for a deprecation window.
Seen in¶
- sources/2026-05-05-databricks-10-trillion-samples-a-day-scaling-beyond-traditional-monitoring — canonical instance. "A key design principle of Hydra is that engineers should not need to understand our ingestion architecture. Whether a metric is accessed through the TSDB-backed aggregated path, or the Lakehouse-backed raw metric path, the interface remains consistent. Metric names, label semantics, and metadata dimensions are unified across environments. Service teams emit metrics once using a standardized interface. The platform handles aggregation, raw preservation, ingestion, storage, and query routing."