CONCEPT Cited by 1 source
Customer-driven metrics¶
Customer-driven metrics are workload measurements determined by customer behaviour rather than by the infrastructure's response: queries per second, client connection count, scanned-objects rate. They are the exogenous inputs to the infrastructure, in contrast to the endogenous signals it generates itself (CPU utilisation, memory pressure, queue depth), which change when the infrastructure is resized.
Named explicitly in MongoDB's 2026-04-07 predictive-auto-scaling retrospective:
"Instead we forecast metrics unaffected by scaling, which we call 'customer-driven metrics' — e.g., queries per second, number of client connections, and the scanned-objects rate. We assume these are independent of instance size or scaling actions." (Source: sources/2026-04-07-mongodb-predictive-auto-scaling-an-experiment)
Why the distinction matters¶
The hazard is the self-invalidating forecast:
- Predict CPU → scale to flatten CPU → CPU is flat → forecast was wrong by construction. CPU is endogenous to the control loop; forecasting it creates a circular dependency.
- Predict QPS → scale to handle QPS → QPS unaffected → forecast is checkable. QPS is exogenous to the scaler (customer workload causes QPS, not server capacity). Forecasting it is well-defined.
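The two bullets above can be illustrated with a toy model. This is a minimal sketch, not MongoDB's implementation: the linear `cpu_percent` model, the `qps_per_vcpu` constant, and all the numbers are assumptions chosen only to make the circularity visible.

```python
def cpu_percent(qps: float, vcpus: int, qps_per_vcpu: float = 20.0) -> float:
    """Toy model (assumed): CPU% rises with demand, falls with capacity, caps at 100."""
    return min(100.0, 100.0 * qps / (vcpus * qps_per_vcpu))

# Forecasts made while the instance has 2 vCPUs.
forecast_qps = 36.0
forecast_cpu = cpu_percent(forecast_qps, vcpus=2)

# The scaler acts on the forecast and doubles capacity before the load arrives.
actual_qps = 36.0  # customer demand is unchanged by the resize
actual_cpu = cpu_percent(actual_qps, vcpus=4)

# The CPU forecast is invalidated by the scaling action it triggered;
# the QPS forecast can still be checked against what was observed.
cpu_forecast_error = abs(forecast_cpu - actual_cpu)
qps_forecast_error = abs(forecast_qps - actual_qps)
print(f"CPU forecast error: {cpu_forecast_error}, QPS forecast error: {qps_forecast_error}")
```

In this sketch the CPU forecast misses by exactly the amount the scaler corrected for, while the QPS forecast has zero error, which is the sense in which only the exogenous forecast is checkable.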
The architectural move: keep three things separate, namely the control variable (instance size), the observable that drives it (CPU), and the forecasted input (customer demand). Forecast the exogenous input, then use a predictor (the Estimator) to map (forecasted_demand × candidate_size) → expected_CPU, so that the forecast itself is independent of the control action.
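A minimal sketch of that mapping, under stated assumptions: the source names an Estimator but does not describe its internals, so the linear model in `estimate_cpu`, the `qps_per_vcpu` constant, the `target_cpu` threshold, and the `pick_size` helper are all hypothetical.

```python
from typing import Optional, Sequence

def estimate_cpu(forecast_qps: float, vcpus: int, qps_per_vcpu: float = 20.0) -> float:
    """Hypothetical Estimator: expected CPU% for a demand level on a candidate size."""
    return 100.0 * forecast_qps / (vcpus * qps_per_vcpu)

def pick_size(
    forecast_qps: float,
    candidate_vcpus: Sequence[int],
    target_cpu: float = 75.0,
) -> Optional[int]:
    """Return the smallest candidate whose expected CPU is at or below the target."""
    for vcpus in sorted(candidate_vcpus):
        if estimate_cpu(forecast_qps, vcpus) <= target_cpu:
            return vcpus
    return None  # no candidate is large enough for the forecasted demand

chosen = pick_size(forecast_qps=100.0, candidate_vcpus=[2, 4, 8, 16])
print(chosen)
```

The demand forecast enters only as an argument; changing the candidate size changes the estimated CPU but never the forecast, which is the separation the paragraph describes.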
Not always exogenous¶
MongoDB's caveat:
"Sometimes this is false; a saturated server exerts backpressure on the customer's queries. But customer-driven metrics are normally exogenous."
A saturated primary rejects connections, times out queries, or triggers client-side backoff, at which point the QPS observed on the server does depend on server capacity. The customer-driven-metric assumption should therefore be relied on only in the non-saturated regime; near saturation, predictive scaling's confidence in the forecast should already have faded via the self-censoring gate. In the non-saturated steady state, measured QPS ≈ customer-generated QPS.
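One simple way to respect this caveat is to exclude saturated samples before they feed a demand forecast. The sketch below is illustrative only and is not the self-censoring gate from the source; the `Sample` type, the `trusted_samples` helper, and the 90% threshold are assumptions.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Sample:
    qps: float          # QPS as observed on the server
    cpu_percent: float  # CPU utilisation at the same moment

def trusted_samples(samples: List[Sample], saturation_cpu: float = 90.0) -> List[Sample]:
    """Keep samples taken below the (assumed) saturation threshold.

    Above it, backpressure may make observed QPS understate customer demand.
    """
    return [s for s in samples if s.cpu_percent < saturation_cpu]

history = [Sample(100, 40), Sample(180, 72), Sample(210, 97), Sample(205, 99)]
usable = trusted_samples(history)
print([s.qps for s in usable])
```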
Canonical examples¶
| Metric | Exogenous? | Why |
|---|---|---|
| Queries per second | Usually yes | Generated by the customer's workload |
| Client connections | Usually yes | Set by app-side connection-pool sizing |
| Scanned-objects rate | Usually yes | Determined by the customer's query shape |
| CPU utilisation | No | Set by server capacity; it is the control target |
| Memory pressure | No | Set by server capacity |
| Queue depth | No | Depends on arrival rate vs service rate |
| Cache hit rate | Partial | Depends on both the customer's working set and cache size |
| P99 latency | No | Depends on server capacity and workload shape |
Adjacent concepts¶
- Exogenous vs endogenous variables is the econometrics terminology — exogenous = determined outside the model, endogenous = determined within.
- Control input vs output, from control theory: forecast the inputs, compute the outputs.
- Leading vs lagging indicators — customer-driven metrics are typically leading (QPS climbs before CPU climbs), which is independently useful for pre-emptive scaling.
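Whether a customer-driven metric actually leads an endogenous one can be checked by finding the lag at which the two series correlate best. The sketch below uses synthetic data and hypothetical helper names (`pearson`, `best_lead`); it is one simple way to estimate a lead, not a method taken from the source.

```python
from math import sqrt
from typing import Sequence

def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def best_lead(leader: Sequence[float], follower: Sequence[float], max_lag: int) -> int:
    """Lag (in samples) at which `leader` best lines up with the later `follower`."""
    def score(lag: int) -> float:
        return pearson(leader[: len(leader) - lag], follower[lag:])
    return max(range(max_lag + 1), key=score)

# Synthetic series: the CPU spike is the QPS spike delayed by two samples.
qps = [1, 1, 2, 4, 8, 4, 2, 1, 1, 1]
cpu = [1, 1, 1, 1, 2, 4, 8, 4, 2, 1]
lead = best_lead(qps, cpu, max_lag=3)
print(f"QPS leads CPU by {lead} samples")
```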
Seen in¶
- sources/2026-04-07-mongodb-predictive-auto-scaling-an-experiment — MongoDB Atlas predictive scaler forecasts QPS, connection count, scanned-objects rate (customer-driven) rather than CPU (endogenous); explicitly names the concept and explains the self-invalidation remediation.
Related¶
- concepts/self-invalidating-forecast — the hazard class this concept remedies.
- concepts/predictive-autoscaling — the primary consumer of customer-driven forecasts.
- concepts/circular-dependency — the generic name for the forecast-self-invalidates shape.
- concepts/backpressure — the named mechanism that can violate the exogeneity assumption.
- systems/mongodb-atlas — canonical wiki instance.