CONCEPT Cited by 1 source

Customer impact hours metric

Definition

A customer impact hours metric is a reliability program's top-line metric defined as summed hours of customer-visible impact from high-severity and filtered medium-severity incidents, scoped by cause (change-triggered / external / capacity / etc.).

The metric is deliberately an imperfect analog of customer sentiment — it's cheaper than direct customer-sentiment survey data while correlating well enough that project-level improvements move it.

Canonical disclosure

Slack's 2025-10-07 Deploy Safety retrospective canonicalises the definition verbatim (Source: sources/2025-10-07-slack-deploy-safety-reducing-customer-impact-from-change):

"Hours of customer impact from high severity and selected medium severity change-triggered incidents."

"Selected" means filtered by post-hoc impact analysis: Slack severity levels convey impending or current impact, not final impact, so a human curation pass is required to distill medium-severity incidents down to the ones that actually mattered.
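The definition and the curation filter can be sketched together. This is a minimal illustration, not Slack's implementation: the `Incident` record, the numeric severity scale, and the `confirmed_customer_impact` flag standing in for the human curation pass are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Incident:
    severity: int                 # assumed scale: 1 = high, 2 = medium
    cause: str                    # "change-triggered", "external", "capacity", ...
    start: datetime
    end: datetime
    confirmed_customer_impact: bool = False  # set by the post-hoc curation pass

def customer_impact_hours(incidents, cause="change-triggered"):
    """Sum hours of customer-visible impact: all high-severity incidents,
    plus medium-severity incidents the curation pass confirmed."""
    total = timedelta()
    for inc in incidents:
        if inc.cause != cause:
            continue
        if inc.severity == 1 or (inc.severity == 2 and inc.confirmed_customer_impact):
            total += inc.end - inc.start
    return total.total_seconds() / 3600
```

Scoping by `cause` is what makes the same computation serve either a deploy-safety program (change-triggered only) or a full reliability program (all-cause).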

The three-layer chain

Slack articulates the metric's position in a three-layer chain:

Customer sentiment <-> Program Metric <-> Project Metric
  • Customer sentiment is the truth the program is trying to move. Direct measurement requires surveys, NPS, support-ticket sentiment — slow, noisy, expensive.
  • Program metric is the imperfect analog — defined well enough to be computed from existing incident data; close enough to customer sentiment that moving it correlates with moving sentiment.
  • Project metric is per-project — "did automatic rollback reduce MTTR on Webapp backend?" — measurable in the specific project's substrate.

Verbatim on the coupling (Source: sources/2025-10-07-slack-deploy-safety-reducing-customer-impact-from-change):

"They're all connected, but it's challenging to know for a specific project how much it is going to move the top line metric."

This is load-bearing for how projects are justified and evaluated: you cannot directly attribute a specific quarterly top-line move to a specific project. You measure the project metric during the project; you measure the program metric in aggregate after a 3-6 month lag; you accept a loose causal chain.

The four metric-design criteria

Slack names four explicit criteria:

  1. Measure results. Not effort, not output. Results.
  2. Understand what is measured (real vs analog). The metric is an analog; don't confuse it with the underlying truth (customer sentiment).
  3. Consistency in measurement, especially subjective portions. The "selected medium severity" filter is the subjective portion — apply consistently across quarters.
  4. Continually validate that the measurement matches customer sentiment "with the leaders having the direct conversations with customers". The program metric's legitimacy is only as good as its ongoing validation against the sentiment it is trying to analog.

Operational trade-offs in metric design

  • Incident-count vs incident-hours. Count is easier to compute but doesn't distinguish a 2-min blip from a 4-hour outage. Hours weight by duration.
  • All-incidents vs severity-filtered. All-incidents over-counts noise from low-severity internal-only incidents; severity-filtered requires a curation discipline.
  • All-cause vs change-triggered. A program investing in deploy safety should track only the change-triggered subset (to avoid credit/blame for external-cause moves); a full reliability program tracks all-cause.
  • Trailing vs leading. Incident-derived metrics are trailing: you measure what has already happened. Leading indicators (canary metrics during deploy, alert volume, change-fail rate) are available faster but correlate more weakly with customer sentiment.
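The count-vs-hours trade-off is easy to see with numbers. A toy illustration with made-up durations (not Slack data):

```python
# Hypothetical quarter: two brief blips and two real outages, in hours.
durations_h = [0.03, 4.0, 0.05, 2.5]

count = len(durations_h)   # 4: a 2-minute blip counts the same as a 4-hour outage
hours = sum(durations_h)   # 6.58: each incident weighted by its duration

# Eliminating both blips halves the count but moves hours by about 1%,
# which is why hours tracks customer impact more faithfully than count.
blip_free = [d for d in durations_h if d >= 0.25]
```

An incident-count program would celebrate fixing the blips; an impact-hours program correctly directs attention at the two long outages.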

See concepts/trailing-metric-patience for the patience discipline required when the metric trails delivery.

Why "imperfect analog" is the right framing

The metric's imperfection is feature, not bug:

  • Perfect would be unmeasurable. True customer sentiment cannot be measured continuously at scale.
  • Direct would be slow. NPS / survey cadence is quarterly; incident data is daily.
  • Pure proxy would decouple. A metric unconnected to customer impact (e.g., deploy-count, MTTR-over-all-incidents) optimises against the substrate without moving sentiment.

The metric sits at the intersection of "measurable from existing data, daily" and "moves when customer sentiment moves." The three-layer-chain framing names exactly why: the chain is the cost of measurability.

Caveats

  • Selection bias in "selected medium severity". The curation pass is a human judgement call; quarter-to-quarter reviewer change can make the metric non-comparable.
  • Severity-level drift. If the org's severity scale shifts (more sev-2s reclassified as sev-3s because of policy change), the metric moves without any underlying customer impact change.
  • Root-cause attribution. Change-triggered-only scope requires per-incident cause classification, which is post-hoc and sometimes contested.
  • The metric can be gamed. Over-filtering medium-severity incidents reduces the metric without improving customer sentiment. The "continually validate" criterion is the check.
  • No per-region / per-customer-segment decomposition in the canonical Slack disclosure. A customer segment with 100% breakage can be invisible if their traffic is small in aggregate.
