CONCEPT Cited by 1 source

Tier-based instance sizing

Tier-based instance sizing is the capacity abstraction where a managed service exposes a discrete, ordered catalog of instance / cluster sizes — MongoDB Atlas's M10, M20, M30, …, M60, … — rather than a continuous capacity dial. Each tier maps to a specific (CPU count, RAM size, I/O budget) tuple on the underlying cloud provider, priced per hour.
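The "discrete, ordered catalog" can be pictured as a sorted list of names with a rank. The following is a minimal sketch, not Atlas's implementation: `TIER_ORDER`, `rank`, and `neighbours` are hypothetical names, and M40 is assumed to sit between M30 and M50.

```python
from typing import List, Optional, Tuple

# Ordered tier names, smallest first. M10, M20, M30, M50 and M60 are named
# on this page; M40 is an assumption filling the elided part of the catalog.
TIER_ORDER: List[str] = ["M10", "M20", "M30", "M40", "M50", "M60"]


def rank(tier: str) -> int:
    """Position of a tier in the catalog; raises ValueError if unknown."""
    return TIER_ORDER.index(tier)


def neighbours(tier: str) -> Tuple[Optional[str], Optional[str]]:
    """Return the (next smaller, next larger) tier, or None at either end."""
    i = rank(tier)
    smaller = TIER_ORDER[i - 1] if i > 0 else None
    larger = TIER_ORDER[i + 1] if i < len(TIER_ORDER) - 1 else None
    return smaller, larger


print(neighbours("M60"))  # ('M50', None) in this truncated sketch
```

Because the catalog is ordered, "bigger" and "smaller" are well defined, which is what lets a scaler reason about adjacent tiers.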

MongoDB's framing:

"We sell MongoDB server sizes as a set of 'tiers' — named M10, M20, and so on — which map to specific instance sizes in each cloud provider." (Source: sources/2026-04-07-mongodb-predictive-auto-scaling-an-experiment)

"Atlas customers decide how many MongoDB servers to deploy in Atlas, what cloud providers and regions to deploy them in, and what size of server: how many CPUs, how much RAM, and so on. Each server in a replica set must use the same tier."

Why the abstraction exists

The alternative is asking customers to specify raw AWS / GCP / Azure instance types, which vary by cloud, break portability, and expose pricing complexity. Tiers:

Abstract the underlying SKU across AWS + GCP + Azure — same M30 behaviour regardless of substrate.
Enable replica-set-homogeneity invariants ("each server in a replica set must use the same tier") without the customer tracking per-cloud SKU equivalences.
Simplify pricing to one dimension per tier.
Enable managed autoscaling — the scaler moves between named tiers rather than arbitrary (CPU, RAM) points.

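The cost is granularity: capacity changes only in whole-tier steps, so a workload whose needs fall between two tiers must either run constrained on the smaller one or pay for unused headroom on the larger.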
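The quantisation can be illustrated by rounding a continuous demand up to the smallest tier that covers it. This is a sketch under loud assumptions: the vCPU and RAM figures below are made-up placeholders, not Atlas's real specifications, and `smallest_fitting_tier` is a hypothetical helper.

```python
from typing import List, NamedTuple, Optional


class Tier(NamedTuple):
    name: str
    vcpus: int
    ram_gb: int


# Placeholder sizes for illustration only; NOT the real Atlas tier specs.
CATALOG: List[Tier] = [
    Tier("M10", 1, 2),
    Tier("M20", 2, 4),
    Tier("M30", 4, 8),
    Tier("M40", 8, 16),
]


def smallest_fitting_tier(need_vcpus: float, need_ram_gb: float) -> Optional[Tier]:
    """Return the first (smallest) tier covering both requirements, else None."""
    for tier in CATALOG:  # CATALOG is sorted smallest first
        if tier.vcpus >= need_vcpus and tier.ram_gb >= need_ram_gb:
            return tier
    return None


# A demand of 2.5 vCPUs does not fit the 2-vCPU placeholder tier,
# so it is rounded up to the 4-vCPU one, leaving 1.5 vCPUs of headroom.
chosen = smallest_fitting_tier(2.5, 3.0)
print(chosen.name, chosen.vcpus - 2.5)  # M30 1.5
```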

How it interacts with autoscaling

On a tier catalog, Atlas's reactive scaler (see concepts/reactive-autoscaling) is constrained to move one tier at a time:

"It only scales between adjacent tiers; for example, if an M60 replica set is underloaded, Atlas will scale it down to M50, but not directly to any tier smaller than that. If the customer's demand changes dramatically, it takes several scaling operations to reach the optimum server size."

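The rationale is conservative: bigger jumps risk over-scaling on a transient. The cost is settling time under sharp shifts: reaching a tier N steps away takes N scaling operations, each preceded by its own detection delay, so the total is roughly N × (detect + op).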
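The adjacent-tier rule and its settling cost can be sketched as follows. The tier list assumes M40 sits between M30 and M50, and the detection and operation durations are placeholder numbers chosen for the arithmetic, not measured Atlas values.

```python
from typing import List

# M40 is an assumption; the other names appear on this page.
TIER_ORDER: List[str] = ["M10", "M20", "M30", "M40", "M50", "M60"]


def reactive_path(current: str, target: str) -> List[str]:
    """Tiers visited when moving one adjacent tier per scaling operation."""
    i, j = TIER_ORDER.index(current), TIER_ORDER.index(target)
    if i == j:
        return []
    step = 1 if j > i else -1
    return [TIER_ORDER[k] for k in range(i + step, j + step, step)]


def settle_minutes(current: str, target: str, detect_min: float, op_min: float) -> float:
    """N x (detect + op), where N is the number of adjacent-tier hops."""
    return len(reactive_path(current, target)) * (detect_min + op_min)


print(reactive_path("M60", "M30"))           # ['M50', 'M40', 'M30']
print(settle_minutes("M60", "M30", 10, 15))  # 75 (3 hops x 25 placeholder minutes)
```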

Predictive scaling (see concepts/predictive-autoscaling) relaxes this: the forecast supplies the target tier, and the scaler moves there in one operation — "scale it directly to the right server size, skipping intermediate tiers."
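The difference in operation count can be made concrete with a small comparison. This is an illustrative sketch, not the Atlas scaler: `scaling_ops` is a hypothetical function, the forecast is taken as a given target tier, and M40 is assumed to exist between M30 and M50.

```python
from typing import List

# M40 is an assumption; the other names appear on this page.
TIER_ORDER: List[str] = ["M10", "M20", "M30", "M40", "M50", "M60"]


def scaling_ops(current: str, target: str, predictive: bool) -> int:
    """Number of scaling operations needed to reach the target tier.

    Reactive: one operation per adjacent-tier hop.
    Predictive: a single operation straight to the forecast tier.
    """
    distance = abs(TIER_ORDER.index(target) - TIER_ORDER.index(current))
    if distance == 0:
        return 0
    return 1 if predictive else distance


print(scaling_ops("M60", "M30", predictive=False))  # 3
print(scaling_ops("M60", "M30", predictive=True))   # 1
```

For adjacent tiers the two modes cost the same; the gap only opens when demand shifts by more than one tier.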

Within-replica-set heterogeneity

MongoDB's caveat:

"Each server in a replica set must use the same tier. (With exceptions.)"

The exception is independently-scaled analytics nodes — a replica set can have its OLAP-workload secondary on a different tier than the OLTP primary + secondaries. See MongoDB's docs on independent analytics-node tiers. The exception is additive to, not a replacement for, the homogeneity invariant.
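The invariant plus its exception can be expressed as a check over a replica set's nodes. This is a minimal sketch: the `Node` record and its `analytics` flag are hypothetical, and it assumes analytics nodes are simply exempt (whether analytics nodes must match each other is not stated in the source, so it is not checked).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Node:
    name: str
    tier: str
    analytics: bool = False  # hypothetical flag marking an analytics node


def satisfies_homogeneity(nodes: List[Node]) -> bool:
    """True if all non-analytics nodes share one tier.

    Analytics nodes are ignored, modelling the "with exceptions" caveat.
    """
    operational_tiers = {n.tier for n in nodes if not n.analytics}
    return len(operational_tiers) <= 1


replica_set = [
    Node("primary", "M30"),
    Node("secondary-1", "M30"),
    Node("secondary-2", "M30"),
    Node("analytics-1", "M50", analytics=True),  # different tier is allowed
]
print(satisfies_homogeneity(replica_set))  # True
```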

Tier-as-resource-bundle vs. raw provisioning

Tiers are a specific point on the capacity-abstraction spectrum:

Raw cloud provisioning — pick AWS m5.xlarge vs r5.2xlarge; full power, full complexity. Appropriate for self-managed deployments.
Tiered — MongoDB Atlas M10 → M60. Appropriate for managed services where the vendor abstracts the cloud.
Serverless — no tier to pick; pay per request / connection / compute-second. Appropriate for spiky unpredictable workloads where capacity shouldn't need thinking about (elasticity property fully delivered).

MongoDB's Atlas tier catalog + autoscaler sits at the tiered position; Aurora DSQL / Lambda / Atlas Serverless push further toward the serverless end. Predictive + reactive autoscaling together are the techniques that make the tiered abstraction behave more like the serverless one for the customer, without requiring the underlying architecture to change.

Per-cloud SKU mapping

An M30 on AWS, an M30 on GCP, and an M30 on Azure are three different concrete SKUs. MongoDB bears the pricing and compatibility mapping; the customer sees one M30. Economic mechanics:

"We charge customers according to their choices, including how many servers, what size, and how many hours they're running. Of course, we compensate our cloud providers (including AWS, Microsoft Azure, and Google Cloud) according to the number and size of servers."

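The mapping and the billing described in this section can be sketched as a lookup table plus a multiplication. Everything concrete here is a placeholder: the SKU identifiers are invented strings, the hourly rate is a made-up number, and a single rate per tier is itself a simplifying assumption.

```python
from typing import Dict, Tuple

# Placeholder SKU identifiers; the real per-cloud instance types are not
# given in the source and are NOT reproduced here.
SKU_BY_TIER_AND_PROVIDER: Dict[Tuple[str, str], str] = {
    ("M30", "AWS"): "aws-sku-placeholder",
    ("M30", "GCP"): "gcp-sku-placeholder",
    ("M30", "AZURE"): "azure-sku-placeholder",
}

# Made-up hourly rate in USD, for the arithmetic only.
HOURLY_RATE_USD: Dict[str, float] = {"M30": 0.50}


def resolve_sku(tier: str, provider: str) -> str:
    """Vendor-side lookup: the customer names a tier, the vendor picks the SKU."""
    return SKU_BY_TIER_AND_PROVIDER[(tier, provider)]


def customer_charge(tier: str, servers: int, hours: float) -> float:
    """Charge = rate for the chosen size x number of servers x hours running."""
    return HOURLY_RATE_USD[tier] * servers * hours


print(resolve_sku("M30", "GCP"))       # gcp-sku-placeholder
print(customer_charge("M30", 3, 24))   # 36.0 with the placeholder rate
```
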
Seen in

sources/2026-04-07-mongodb-predictive-auto-scaling-an-experiment — MongoDB Atlas's tier catalog (M10 / M20 / … / M60 / …) is the instance-size abstraction the predictive + reactive auto-scalers operate over; canonical wiki instance.

concepts/reactive-autoscaling — the one-tier-at-a-time constraint is intrinsic to reactive scaling on a tier catalog.
concepts/predictive-autoscaling — relaxes the constraint by supplying target tier from the forecast.
concepts/elasticity — the property the tier abstraction approximates.
systems/mongodb-atlas — canonical wiki instance.
systems/aws-ec2 — underlying cloud substrate.