
Self-invalidating forecast

A self-invalidating forecast is the hazard class in which a control loop forecasts a metric whose future value depends on the control action taken in response to the forecast. Predicting the metric and acting on the prediction changes the metric, so the prediction turns out wrong — but "wrong" only in the sense that the action worked as intended.

Named directly by MongoDB:

"We can't just train a model based on recent fluctuations of CPU, because that would create a circular dependency: if we predict a CPU spike and scale accordingly, we eliminate the spike, invalidating the forecast." (Source: sources/2026-04-07-mongodb-predictive-auto-scaling-an-experiment)

Why this is a problem

Forecasting endogenous metrics breaks the usual model of "forecast → act → measure → evaluate forecast accuracy":

  • A model trained on historical CPU has learned historical responses, including the historical scaler's behaviour.
  • Predict-and-scale eliminates the spike → the new CPU trace no longer resembles the training distribution.
  • The training-time error signal ("did forecasted CPU match actual CPU?") breaks: of course it didn't — we scaled.
  • Retraining on the new data teaches the model that spikes don't happen, which becomes wrong the moment the system stops scaling.
  • The loop is unstable — forecast accuracy degrades monotonically under normal operation.
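The degenerate loop above can be shown in a few lines. A toy sketch (all numbers, policies, and function names invented): a scaler that acts on its own CPU forecast makes that forecast look maximally wrong precisely when the action works.

```python
def forecast_cpu(history):
    """Naive persistence forecast: the next reading looks like the last one."""
    return history[-1]

def scale_if_needed(predicted_cpu, capacity):
    """Hypothetical policy: double capacity when a spike is predicted."""
    return capacity * 2 if predicted_cpu > 80 else capacity

def observed_cpu(demand, capacity):
    """Simplified linear model: CPU is demand spread over capacity."""
    return min(100, 100 * demand / capacity)

demand, capacity = 0.9, 1                 # demand in units of one instance's capacity
history = [observed_cpu(demand, capacity)]        # CPU at 90%

predicted = forecast_cpu(history)                 # predicts 90% -> spike
capacity = scale_if_needed(predicted, capacity)   # scale out to 2 units
actual = observed_cpu(demand, capacity)           # CPU drops to 45%

error = abs(predicted - actual)   # forecast "missed" by 45 points -- because scaling worked
```

Retrain on that trace and the model learns that 90% readings are followed by 45% readings, i.e. that spikes resolve themselves — the hazard in the fourth bullet above.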

The fix: forecast exogenous inputs

The architectural move MongoDB named:

"Instead we forecast metrics unaffected by scaling, which we call 'customer-driven metrics' — e.g., queries per second, number of client connections, and the scanned-objects rate."

Customer-driven metrics are the upstream causes of CPU (customer workload → queries → server load → CPU utilization). They're independent of the scaling action — customer workload doesn't change because the server got bigger. Forecast those; then use a separate model (the Estimator) to map (forecasted_demand × candidate_size) → expected_CPU. The forecast is well-defined; the mapping is well-defined; the composition gives the right control signal without circularity.

Three-step refactor

Before (self-invalidating):

CPU_observed → forecast CPU_future → scale → (forecast was wrong)

After (forecast exogenous):

QPS_observed → forecast QPS_future
            → estimator(QPS_future, candidate_size) → expected_CPU
            → planner picks cheapest size under CPU ceiling
            → scale
            → QPS unchanged, QPS forecast checkable
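The refactored pipeline can be sketched end to end. This is a minimal illustration, not MongoDB's implementation: the persistence forecast, the linear estimator, and the size/ceiling numbers are all placeholder assumptions.

```python
def forecast_qps(qps_history):
    """Forecast the exogenous driver. Persistence model as a placeholder."""
    return qps_history[-1]

def estimator(qps, instance_size):
    """Map (forecasted demand, candidate size) -> expected CPU.
    Toy linear model: assumes 1000 QPS saturates a size-1 instance."""
    return 100 * qps / (instance_size * 1000)

def plan(qps_future, sizes, cpu_ceiling=70):
    """Planner: pick the cheapest size whose expected CPU stays under the ceiling."""
    for size in sorted(sizes):
        if estimator(qps_future, size) < cpu_ceiling:
            return size
    return max(sizes)   # nothing fits: take the largest available size

qps_future = forecast_qps([800, 900, 1200])   # exogenous: unaffected by scaling
chosen = plan(qps_future, sizes=[1, 2, 4, 8])
```

After scaling, the QPS trace is unchanged, so next cycle's forecast can be scored against reality — the circularity is gone.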

The Estimator is trainable on a fixed dataset of (QPS, instance_size, CPU) triples that doesn't move with scaling — MongoDB trained theirs on 25 million sample points.
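Because the (QPS, instance_size, CPU) relationship is physical rather than policy-driven, the dataset stays valid however the scaler behaves. A minimal fitting sketch — the one-parameter model CPU ≈ a · QPS / size is an assumption for illustration; the real Estimator can be any regression over such triples:

```python
def fit_estimator(triples):
    """Least-squares fit of CPU = a * (qps / size) over a fixed dataset."""
    xs = [qps / size for qps, size, _ in triples]
    ys = [cpu for _, _, cpu in triples]
    a = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return lambda qps, size: a * qps / size

# Invented historical samples. This dataset does not shift when the scaler
# acts, because it records a load/capacity relationship, not a policy.
data = [(1000, 1, 50.0), (2000, 1, 100.0), (2000, 2, 50.0), (4000, 4, 50.0)]
estimate = fit_estimator(data)
```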

Not always perfectly exogenous

MongoDB's caveat:

"Sometimes this is false; a saturated server exerts backpressure on the customer's queries. But customer-driven metrics are normally exogenous."

A saturated server rejects connections, times out queries, forces client-side backoff — in the saturated regime, QPS is partially controlled by server capacity. This is why the customer-driven-metric assumption is load-bearing only in normal operation; near saturation, self-censoring should already have degraded forecast confidence and the scaler should fall back to reactive.
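The fallback can be made explicit as a mode switch. A hedged sketch with an invented threshold — the point is the shape (predictive only while the exogeneity assumption holds), not the specific number:

```python
def choose_mode(observed_cpu, saturation_threshold=90):
    """Use the predictive path only while customer-driven metrics are exogenous."""
    if observed_cpu >= saturation_threshold:
        # Backpressure regime: QPS is partially set by server capacity,
        # so the QPS forecast is suspect. Act reactively on observed CPU.
        return "reactive"
    return "predictive"
```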

Other instances across systems design

The same hazard shape appears in adjacent control loops:

  • Forecast query latency → route traffic to faster node → latency flattens → forecast stops being useful. Remedy: forecast the load, not the latency.
  • Forecast cache-miss rate → pre-warm cache → miss rate drops → forecast needs to discount the pre-warming effect. Remedy: forecast inputs (request distribution), compute miss rate from model.
  • Forecast queue depth → increase consumers → queue drains → forecast wrong. Remedy: forecast arrival rate, model service rate separately.

The common pattern: split the metric into the exogenous driver + the endogenous response; forecast the driver; model the response.
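The queue-depth case makes the split concrete. An illustrative sketch (names and rates invented): the arrival-rate forecast is untouched by the consumer count, while expected depth is computed from it.

```python
def forecast_arrival_rate(history):
    """Forecast the exogenous driver. Persistence model as a placeholder."""
    return history[-1]

def expected_depth(depth_now, arrival_rate, consumers, per_consumer_rate, horizon):
    """Model the endogenous response: depth grows at the net rate, floored at zero."""
    net = arrival_rate - consumers * per_consumer_rate
    return max(0, depth_now + net * horizon)

rate = forecast_arrival_rate([100, 120, 150])   # msgs/sec, exogenous

# Adding consumers changes the modelled depth, not the arrival-rate forecast:
depth_few  = expected_depth(0, rate, consumers=1, per_consumer_rate=50, horizon=10)
depth_many = expected_depth(0, rate, consumers=4, per_consumer_rate=50, horizon=10)
```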

Relationship to circular dependency (deployment context)

The wiki already has a page on concepts/circular-dependency in the deployment context (deploy scripts depending on the service they're deploying). MongoDB uses the same name for this forecast-context hazard — the same structural shape (output depends on input, input depends on output) at a different layer of the stack.

Cross-reference:

| Layer      | Deployment-context           | Forecast-context                 |
|------------|------------------------------|----------------------------------|
| Trigger    | "Fix the service"            | "Predict the metric"             |
| Dependency | Fix needs service reachable  | Metric depends on control action |
| Symptom    | Can't ship during incident   | Forecast degrades under use      |
| Remedy     | Mirror / independent infra   | Forecast exogenous inputs        |
