
PATTERN Cited by 1 source

Behavioral metric over technical metric

When validating data-layer changes, prefer customer-behavior signals (playback starts, conversions, engagements) over infrastructure signals (latency, error rates, throughput) as the primary canary-analysis signal. Data corruption often produces technically correct but semantically wrong responses, which infrastructure metrics cannot detect.

The pattern

  1. Identify the behavioral metric that most closely indicates customer impact in your domain
  2. Use it as the primary experiment signal for data canary analysis
  3. Keep infrastructure metrics as secondary/supporting signals — they still catch code-level regressions
  4. Tune thresholds and abort logic against the behavioral metric's noise floor and detection speed
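
The four steps above can be sketched as a single decision function. This is a minimal illustration, not Netflix's implementation: the names `evaluate_canary` and `CanaryVerdict`, the 3-sigma noise floor, and the 2× error-ratio limit are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class CanaryVerdict:
    abort: bool
    reason: str  # "behavioral", "infrastructure", or "ok"


def evaluate_canary(baseline_behavioral, canary_behavioral,
                    baseline_error_rate, canary_error_rate,
                    noise_sigmas=3.0, error_ratio_limit=2.0):
    """Hypothetical canary check: behavioral metric first, infrastructure second.

    Each argument is a list of per-interval samples (at least two for the
    behavioral baseline, since a standard deviation is computed from it).
    """
    # Primary signal: abort if the canary's behavioral metric falls below
    # the baseline's noise floor (mean minus N standard deviations).
    floor = mean(baseline_behavioral) - noise_sigmas * stdev(baseline_behavioral)
    if mean(canary_behavioral) < floor:
        return CanaryVerdict(True, "behavioral")

    # Secondary signal: infrastructure metrics still catch code-level
    # regressions, so compare error rates as a supporting check.
    baseline_err = max(mean(baseline_error_rate), 1e-9)  # avoid division by zero
    if mean(canary_error_rate) / baseline_err > error_ratio_limit:
        return CanaryVerdict(True, "infrastructure")

    return CanaryVerdict(False, "ok")
```

Checking the behavioral metric first means a semantically wrong but error-free data change aborts the canary even when the error-rate comparison would have passed.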

Why

Data corruption can produce HTTP 200 responses with normal latency that are semantically wrong. A catalog entry that says a title doesn't exist is a valid response — no error, no timeout, no elevated latency. But the customer can't start playback.
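
A small sketch makes the gap concrete. The response shape, the field names (`status`, `latency_ms`, `title_exists`) and both check functions are hypothetical, invented only to show how an infrastructure check and a behavioral check can disagree on the same response.

```python
def infrastructure_healthy(response, max_latency_ms=200):
    """Infrastructure view: a successful status code and acceptable latency."""
    return response["status"] == 200 and response["latency_ms"] <= max_latency_ms


def playback_can_start(response):
    """Behavioral view: can the customer actually start this title?"""
    return bool(response["body"].get("title_exists", False))


# Hypothetical response from a corrupted catalog: well-formed and fast,
# but it claims the title does not exist.
corrupted = {
    "status": 200,
    "latency_ms": 35,
    "body": {"title_id": "t-123", "title_exists": False},
}

infra_ok = infrastructure_healthy(corrupted)   # passes the infrastructure check
behavior_ok = playback_can_start(corrupted)    # fails the behavioral check
```

Here `infra_ok` is `True` while `behavior_ok` is `False`: nothing in the status code or the latency reveals that the customer is blocked.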

Netflix's finding:

"SPS proved more reliable than latency or error rates for detecting catalog corruption because it directly measures customer impact, and data errors may not always manifest as application errors to our catalog metadata service."

Netflix's canonical instance

Starts Per Second (SPS) — actual customer playback attempts — is the primary signal in the Data Canary. Multi-tenant testing revealed that the playback-request tenant identifies failures fastest because it exercises the full metadata path end-to-end.

Result: 10× error differential between canary and baseline during controlled failure injection, detectable in 2.5–4 minutes.
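
One way to reason about detection time is to measure how long the canary-to-baseline ratio takes to stay above a threshold. The sketch below assumes the differential is a ratio of per-interval error counts, which the source does not specify. The function name, the consecutive-interval rule, and the sample numbers in the usage note are illustrative, not Netflix data.

```python
def minutes_to_detection(canary_errors, baseline_errors, interval_seconds,
                         ratio_threshold=10.0, consecutive=2):
    """Return minutes until the ratio holds above the threshold, else None.

    Hypothetical rule: detection fires once the canary/baseline ratio has
    met the threshold for `consecutive` intervals in a row.
    """
    streak = 0
    for i, (canary, baseline) in enumerate(zip(canary_errors, baseline_errors)):
        ratio = canary / max(baseline, 1e-9)  # guard against a zero baseline
        streak = streak + 1 if ratio >= ratio_threshold else 0
        if streak >= consecutive:
            # Interval i ends (i + 1) intervals after the start of the run.
            return (i + 1) * interval_seconds / 60
    return None
```

With made-up 30-second samples `canary=[1, 2, 12, 15, 20]` and `baseline=[1, 1, 1, 1, 1]`, the ratio first holds at or above 10 for two consecutive intervals at the fourth sample, so the function returns 2.0 minutes.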

Choosing your behavioral metric

The metric should be:

  • Directly impacted by the data being validated
  • High-volume enough to detect regression quickly (within your validation window)
  • Low-noise enough that real regression stands out from normal variance
  • End-to-end — measuring downstream customer behavior, not just the service's own output
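
The volume and noise criteria above can be checked numerically against historical samples; the other two criteria are judgment calls. In this rough sketch the function name `metric_fitness`, the coefficient-of-variation noise measure, and the thresholds are assumptions for illustration.

```python
from statistics import mean, stdev


def metric_fitness(events_per_minute, window_minutes, min_events, max_cv):
    """Hypothetical screen for a candidate behavioral metric.

    events_per_minute: historical per-minute counts (at least two samples).
    window_minutes:    length of the validation window.
    min_events:        events needed in the window to judge a regression.
    max_cv:            highest acceptable coefficient of variation.
    """
    avg = mean(events_per_minute)
    # Coefficient of variation: standard deviation relative to the mean.
    cv = stdev(events_per_minute) / avg if avg > 0 else float("inf")
    return {
        "enough_volume": avg * window_minutes >= min_events,
        "low_noise": cv <= max_cv,
    }
```

A metric that fails either check makes a poor primary signal: too few events means slow detection, and a high coefficient of variation means a real regression is hard to separate from normal variance.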

