Reactive auto-scaling¶
Reactive auto-scaling is capacity control that observes over- or under-utilisation and then changes capacity — the standard industry default from 2010s-era AWS Auto Scaling Groups through MongoDB Atlas's pre-2025 scaler. Its sibling-and-opposite is predictive auto-scaling, which forecasts demand and scales ahead.
Structural latency¶
Reactive scaling has a hard floor set by the sum of four stages:
```
reactive_latency = detection_time     # threshold + hysteresis window
                 + decision_time      # pick size / count
                 + provisioning_time  # allocate / boot
                 + warmup_time        # traffic live on new capacity
```
Spikes shorter than `reactive_latency` see every request hit pre-spike capacity and experience the full tail-latency blow-up, regardless of scaling policy. See concepts/scaling-latency for the decomposition in a cluster-scaler context, and concepts/spiky-traffic for the traffic regime this breaks on.
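The floor can be made concrete with a minimal sketch. All stage durations below are illustrative assumptions, not vendor figures; only the structure (sum of four stages, spike-vs-floor comparison) comes from the decomposition above.

```python
# Illustrative stage durations in seconds (assumed, not measured).
detection_time = 120      # sustained-threshold (hysteresis) window
decision_time = 5         # pick new size / replica count
provisioning_time = 180   # allocate and boot new capacity
warmup_time = 60          # traffic live on the new capacity

reactive_latency = (detection_time + decision_time
                    + provisioning_time + warmup_time)

def spike_outlasts_floor(spike_duration_s: float) -> bool:
    """True only if the spike is still running when new capacity serves traffic.

    A spike shorter than the floor ends before any scaled-up capacity
    takes load, so every request in it hits pre-spike capacity.
    """
    return spike_duration_s > reactive_latency

print(reactive_latency)              # 365 seconds for these assumed stages
print(spike_outlasts_floor(90))      # False: spike fully served pre-spike
print(spike_outlasts_floor(1800))    # True: new capacity arrives mid-spike
```

Shrinking any one stage (e.g. faster provisioning) lowers the floor but never removes it; that is why predictive scaling attacks the problem from the other side, by starting the clock before the spike.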
Detection hysteresis¶
Reactive scalers don't fire on every threshold crossing — flapping would thrash capacity and cost more than it saves. They impose a sustained-over/under hysteresis window before triggering:
- MongoDB Atlas (pre-2025): "scales up after a few minutes of overload, or a few hours of underload" (Source: sources/2026-04-07-mongodb-predictive-auto-scaling-an-experiment). Scale-up latency ≪ scale-down latency — the asymmetric-risk framing.
The hysteresis is additive to provisioning time, not parallel.
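A sustained-over/under trigger with asymmetric windows can be sketched as follows. Thresholds, window lengths, and the class name are illustrative assumptions; only the shape (every sample in the window must sustain the breach, scale-up window much shorter than scale-down) mirrors the behaviour described above.

```python
from collections import deque

class HysteresisTrigger:
    """Fire only after utilisation sustains a breach for a full window."""

    def __init__(self, high=0.75, low=0.25, up_window=3, down_window=12):
        # Asymmetric windows: scale-up fires in minutes, scale-down in hours.
        self.high, self.low = high, low
        self.up_window, self.down_window = up_window, down_window
        self.samples = deque(maxlen=max(up_window, down_window))

    def observe(self, utilisation: float) -> str:
        self.samples.append(utilisation)
        recent = list(self.samples)
        # A single sample back inside the band resets the countdown.
        if len(recent) >= self.up_window and all(
                u > self.high for u in recent[-self.up_window:]):
            return "scale_up"
        if len(recent) >= self.down_window and all(
                u < self.low for u in recent[-self.down_window:]):
            return "scale_down"
        return "hold"

t = HysteresisTrigger()
decisions = [t.observe(u) for u in [0.9, 0.9, 0.6, 0.9, 0.9, 0.9]]
# The 0.6 dip resets the window; only the final sustained run fires.
```

Note how the dip in the middle delays the trigger by a full window: the flap-resistance that makes the scaler cheap is exactly what makes it slow, and that delay lands on top of provisioning time.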
One-step-at-a-time constraint¶
For tiered / stepped capacity primitives (M10, M20, … M60 on MongoDB Atlas; discrete replica counts on ASGs), reactive scalers typically move one step per decision:
"It only scales between adjacent tiers; for example, if an M60 replica set is underloaded, Atlas will scale it down to M50, but not directly to any tier smaller than that. If the customer's demand changes dramatically, it takes several scaling operations to reach the optimum server size." (Source: sources/2026-04-07-mongodb-predictive-auto-scaling-an-experiment)
The rationale is conservatism (large jumps on a transient may over-scale); the cost is N × (hysteresis + scaling_op) to settle under a sharp demand shift. Predictive scaling relaxes this because the forecast supplies the target size directly.
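The N × (hysteresis + scaling_op) settle cost can be sketched directly. The tier names mirror the Atlas-style sizes quoted above; the per-step costs are illustrative assumptions.

```python
# Adjacent-tier catalog, smallest to largest (Atlas-style names).
TIERS = ["M10", "M20", "M30", "M40", "M50", "M60"]

def steps_to_settle(current: str, target: str) -> int:
    """One-step-per-decision: N adjacent moves, each paying a full cycle."""
    return abs(TIERS.index(target) - TIERS.index(current))

hysteresis_s, scaling_op_s = 300, 600   # assumed per-step costs in seconds

n = steps_to_settle("M60", "M20")       # 4 adjacent scale-down moves
total_settle_s = n * (hysteresis_s + scaling_op_s)
print(n, total_settle_s)                # 4 steps, 3600 s to settle
```

A forecast-driven scaler pays the cycle once, because the target size is known up front; the reactive scaler pays it N times while rediscovering the target one tier at a time.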
Self-interference during overload¶
A reactive scaler responding to overload must execute the scaling op on the already-overloaded system. MongoDB's framing: "an overloaded server is bad for performance, and if it's really slammed, it could interfere with the scaling operation itself." Extreme overload can thus delay the very scaling op that would resolve it — a pathological feedback loop.
Where reactive scaling still wins¶
Reactive scaling doesn't go away when predictive ships; it becomes the backstop for forecast failures:
- Unpredictable demand — workloads with no seasonality and no short-term trend; the Forecaster cannot help.
- Forecast error — even seasonal workloads have aperiodic components; a reactive backstop catches what the forecaster misses.
- Model exclusion — workloads the Estimator isn't accurate on (MongoDB: ~13% of replica sets excluded from predictive scaling, reactive-only).
- Scale-down in conservative rollouts — MongoDB's 2025 production predictive scaler is scale-up-only; the reactive scaler owns scale-down as the higher-consequence direction.
"All customers who enabled auto-scaling (about a third) will soon have both predictive and reactive auto-scaling." Two scalers, different roles, same control loop.
Seen in¶
- sources/2026-04-07-mongodb-predictive-auto-scaling-an-experiment — MongoDB Atlas's pre-2025 reactive auto-scaler, explicitly characterised with its latency floor, one-tier-at-a-time constraint, and self-interference hazard; retained post-2025 as scale-down default + scale-up backstop when predictive forecasts fail.
Related¶
- concepts/predictive-autoscaling — the sibling-and-complement concept.
- concepts/scaling-latency — the latency floor reactive scaling always pays.
- concepts/tier-based-instance-sizing — the discrete-size catalog that creates the one-step-at-a-time constraint.
- concepts/elasticity — the property reactive scaling approximates but can't fully deliver under short spikes.
- concepts/spiky-traffic — the traffic pattern reactive scaling can't smooth.
- systems/mongodb-atlas — canonical wiki instance.
- systems/aws-auto-scaling-groups — classical AWS reactive primitive.