CONCEPT Cited by 1 source

Customer-driven metrics

Customer-driven metrics are workload measurements determined by the customer's behaviour rather than by the infrastructure's response, such as queries per second, client connection count, and scanned-objects rate. They are the exogenous inputs to the infrastructure, in contrast to the endogenous signals the infrastructure generates itself (CPU utilisation, memory pressure, queue depth), which change whenever the infrastructure is resized.

Named explicitly in MongoDB's 2026-04-07 predictive-auto-scaling retrospective:

"Instead we forecast metrics unaffected by scaling, which we call 'customer-driven metrics' — e.g., queries per second, number of client connections, and the scanned-objects rate. We assume these are independent of instance size or scaling actions." (Source: sources/2026-04-07-mongodb-predictive-auto-scaling-an-experiment)

Why the distinction matters

The hazard is the self-invalidating forecast:

  • Predict CPU → scale to flatten CPU → CPU is flat → forecast was wrong by construction. CPU is endogenous to the control loop; forecasting it creates a circular dependency.
  • Predict QPS → scale to handle QPS → QPS unaffected → forecast is checkable. QPS is exogenous to the scaler (customer workload causes QPS, not server capacity). Forecasting it is well-defined.
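The two cases above can be made concrete with a toy model. This is only an illustrative sketch: the linear CPU model, the constant `QUERIES_PER_CPU_PERCENT`, and all numbers are assumptions chosen for the example, not values from MongoDB's write-up.

```python
# Toy model: every constant here is a hypothetical, illustrative value.
QUERIES_PER_CPU_PERCENT = 20.0  # queries/sec that cost 1% CPU on a 1-unit instance


def cpu_percent(qps: float, size_units: int) -> float:
    """Endogenous signal: depends on demand AND on the size the scaler chose."""
    return min(100.0, qps / QUERIES_PER_CPU_PERCENT / size_units)


current_size = 2
forecast_qps = 3000.0

# CPU forecast made before any scaling action, at the current size.
forecast_cpu = cpu_percent(forecast_qps, current_size)

# The scaler reacts to that forecast by doubling the instance size.
new_size = 4

# Demand then arrives exactly as forecast; the customer is unaffected by the resize.
actual_qps = 3000.0
actual_cpu = cpu_percent(actual_qps, new_size)

qps_error = abs(actual_qps - forecast_qps)  # forecast can be checked against reality
cpu_error = abs(actual_cpu - forecast_cpu)  # forecast invalidated by the scaler's own action

print(f"QPS forecast error: {qps_error}")
print(f"CPU forecast error: {cpu_error} percentage points")
```

Even with a perfect demand forecast, the CPU forecast misses by exactly the amount the scaling action removed, so its error says nothing about forecast quality.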

The architectural move is to keep three things separate: the control variable (instance size), the observable that drives it (CPU), and the forecasted input (customer demand). Forecast the exogenous input, then use a predictor (the Estimator) to map (forecasted_demand × candidate_size) → expected_CPU. The forecast itself then stays independent of the control action.
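A minimal sketch of the demand-to-CPU mapping follows. The source names the Estimator but this page does not describe its internals, so the linear model in `estimate_cpu`, the `qps_per_vcpu` constant, the 75% target, and the `pick_size` helper are all hypothetical stand-ins.

```python
from typing import Optional, Sequence


def estimate_cpu(forecast_qps: float, size_vcpus: int,
                 qps_per_vcpu: float = 500.0) -> float:
    """Hypothetical Estimator: expected CPU % for a demand level and candidate size.

    Assumes CPU scales linearly with QPS and inversely with vCPU count,
    capped at 100%. A real estimator would be fitted to observed data.
    """
    return min(100.0, 100.0 * forecast_qps / (size_vcpus * qps_per_vcpu))


def pick_size(forecast_qps: float, candidates: Sequence[int],
              target_cpu: float = 75.0) -> Optional[int]:
    """Return the smallest candidate size whose expected CPU is within target."""
    for size in sorted(candidates):
        if estimate_cpu(forecast_qps, size) <= target_cpu:
            return size
    return None  # no candidate can absorb the forecast demand


# The demand forecast is the only forecast; CPU is derived per candidate size.
chosen = pick_size(3000.0, [2, 4, 8, 16])
print(chosen)
```

The forecast is made once, on demand alone; the candidate size enters only inside the estimator, so trying a different size never changes what was forecast.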

Not always exogenous

MongoDB's caveat:

"Sometimes this is false; a saturated server exerts backpressure on the customer's queries. But customer-driven metrics are normally exogenous."

A saturated primary rejects connections, times out queries, or triggers client-side backoff, at which point the QPS observed on the server does depend on server capacity. The customer-driven-metric assumption therefore holds only in the non-saturated regime; near saturation, predictive scaling's confidence in the forecast should already have faded via the self-censoring gate. In the non-saturated steady state, measured QPS ≈ customer-generated QPS.
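The backpressure caveat can be sketched as a clipping effect: once offered load exceeds capacity, the server only ever observes the capacity. The `min` model and the `headroom` filter below are illustrative assumptions; the filter is a hypothetical stand-in, not the self-censoring gate itself.

```python
def observed_qps(offered_qps: float, capacity_qps: float) -> float:
    """What the server measures: offered load, clipped at what it can serve."""
    return min(offered_qps, capacity_qps)


def is_unsaturated(measured_qps: float, capacity_qps: float,
                   headroom: float = 0.9) -> bool:
    """Hypothetical filter: trust a sample only if it sits well below capacity."""
    return measured_qps < headroom * capacity_qps


capacity = 1000.0
offered = [800.0, 1200.0, 1500.0]  # true customer demand (not directly visible)

measured = [observed_qps(q, capacity) for q in offered]
trusted = [m for m in measured if is_unsaturated(m, capacity)]

print(measured)  # demand above capacity is indistinguishable from capacity
print(trusted)   # only the unsaturated sample reflects customer demand
```

Samples taken at saturation under-report demand, so a forecaster trained on them would learn the server's ceiling rather than the customer's workload.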

Canonical examples

| Metric | Exogenous? | Why |
|---|---|---|
| Queries per second | Usually yes | Generated by the customer's workload |
| Client connections | Usually yes | Set by app-side connection-pool sizing |
| Scanned-objects rate | Usually yes | Determined by the customer's query shape |
| CPU utilisation | No | Set by server capacity; it is the control target |
| Memory pressure | No | Set by server capacity |
| Queue depth | No | Depends on arrival rate vs service rate |
| Cache hit rate | Partial | Depends on both the customer's working set and the cache size |
| P99 latency | No | Depends on server capacity and workload shape |

Adjacent concepts

  • Exogenous vs endogenous variables is the econometrics terminology — exogenous = determined outside the model, endogenous = determined within.
  • Control input vs output, from control theory: forecast the inputs, compute the outputs.
  • Leading vs lagging indicators — customer-driven metrics are typically leading (QPS climbs before CPU climbs), which is independently useful for pre-emptive scaling.
