CONCEPT Cited by 3 sources
Elasticity¶
Elasticity is the property that a service's capacity and performance expand and contract with customer demand, without requiring the customer to forecast, provision, or negotiate, and without imposing sharp edges (quotas, rate cliffs, separate tiers) that the developer has to architect around.
It is the architectural lever behind most "developer experience" improvements on foundational cloud services: making elasticity real is what lets the developer stop thinking about the service.
Two dimensions (per S3's framing, 2025)¶
Andy Warfield separates elasticity into capacity and performance:
- Capacity elasticity: "On S3, you never have to do up front provisioning of capacity or performance, and you don't worry about running out of space." No upfront sizing; no per-bucket capacity ceiling.
- Performance elasticity: "Any customer should be entitled to use the entire performance capability of S3, as long as it didn't interfere with others." Implemented via (a) transparent docs on request shape; (b) that shape baked into systems/aws-crt; (c) latency-class unlock via systems/s3-express-one-zone.
Both dimensions become invisible when they work — and both have a failure mode: "when we have aspects of the system that require extra work from developers, the lack of simplicity is distracting and time consuming."
(Source: sources/2025-03-14-allthingsdistributed-s3-simplicity-is-table-stakes)
Elasticity on compute (Lambda framing)¶
The Lambda PR/FAQ states the compute version: the scale range includes zero (concepts/scale-to-zero), and the same code path handles "one application invocation per month and 1,000 per second." Elasticity for compute implies that no warm capacity can be assumed, which in turn forces multi-tenant packing (see concepts/micro-vm-isolation) so that capacity is actually recyclable at low utilisation.
(Source: sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years)
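To put rough numbers on the two ends of that range, the sketch below applies Little's law (average in-flight requests = arrival rate × average duration). The 250 ms average duration, the 30-day month, and both function names are assumptions made for illustration; none of them come from the PR/FAQ.

```python
import math


def average_in_flight(requests_per_second: float, avg_duration_s: float) -> float:
    """Little's law: mean number of requests being served at any instant."""
    return requests_per_second * avg_duration_s


def instances_needed(in_flight_now: int, per_instance_concurrency: int = 1) -> int:
    """Instances required for the requests in flight right now; zero when idle."""
    if in_flight_now < 0 or per_instance_concurrency < 1:
        raise ValueError("in_flight_now must be >= 0 and concurrency >= 1")
    return math.ceil(in_flight_now / per_instance_concurrency)


SECONDS_PER_MONTH = 30 * 24 * 3600  # assumed 30-day month
AVG_DURATION_S = 0.25               # assumed average invocation duration

# One invocation per month: a dedicated warm instance would be idle almost always.
monthly = average_in_flight(1 / SECONDS_PER_MONTH, AVG_DURATION_S)

# 1,000 invocations per second: hundreds of requests in flight on average.
busy = average_in_flight(1000, AVG_DURATION_S)
```

Under these assumptions the monthly workload keeps far less than one millionth of an instance busy on average, which is why a scale range that includes zero only makes economic sense when the idle capacity can be handed to other tenants.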
What erodes elasticity¶
- Explicit quotas surfaced to the customer (S3's 100-buckets/account cap — later raised to up to 1M/account).
- Provisioning steps (pre-configure throughput, pre-declare concurrency).
- Tier cliffs where a workload exceeds one storage/latency class and must be migrated.
- Performance gotchas in the client that require custom retry / parallelization code. Fix: move the shape into a library.
The S3 post treats each of those as a simplification debt: the feature wasn't simple enough at launch, and the debt is being paid down over time. See concepts/simplicity-vs-velocity.
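As an illustration of the last item in the list above, "moving the shape into a library" could look like the sketch below: the caller asks for an object, and the helper takes care of splitting it into byte ranges, fetching them in parallel, and retrying transient failures. This is not the AWS CRT API. `fetch_range` is a hypothetical callable supplied by the caller, and the part size, worker count, and retry policy are assumed values.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(size: int, part_size: int) -> List[Tuple[int, int]]:
    """Inclusive (start, end) byte ranges that cover [0, size)."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [(start, min(start + part_size, size) - 1)
            for start in range(0, size, part_size)]


def with_retries(fn: Callable[..., bytes], attempts: int = 3,
                 base_delay_s: float = 0.01) -> Callable[..., bytes]:
    """Wrap fn so that IOError is retried with exponential backoff."""
    def wrapped(*args):
        for attempt in range(attempts):
            try:
                return fn(*args)
            except IOError:
                if attempt == attempts - 1:
                    raise  # out of attempts: surface the error to the caller
                time.sleep(base_delay_s * 2 ** attempt)
    return wrapped


def fetch_object(fetch_range: Callable[[int, int], bytes], size: int,
                 part_size: int = 8 * 1024 * 1024, max_workers: int = 8) -> bytes:
    """Fetch an object of known size as parallel ranged reads, in order."""
    fetch = with_retries(fetch_range)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so parts join correctly.
        parts = pool.map(lambda r: fetch(*r), split_ranges(size, part_size))
        return b"".join(parts)
```

The application code shrinks to a single `fetch_object(...)` call; the retry and parallelization logic lives in one place and can be tuned without touching callers.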
Predictive vs. reactive realisations¶
The elasticity ideal — "capacity perfectly matches demand at every moment" — can be approached from two directions in auto-scaling systems:
- Reactive: watch observed utilisation, act after it crosses a threshold. The time to detect the change plus the time to act on it always shows up in the observed tail.
- Predictive: forecast future demand, act before the load arrives. Hides scaling latency from the observed tail when the forecast is correct.
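The difference between the two approaches can be made concrete with a toy simulation. The sketch below is an illustrative model, not MongoDB's algorithm: time is divided into ticks, a scaling decision made at tick `t` takes effect `delay` ticks later, and the predictive policy is given a perfect forecast. All names and numbers are assumptions.

```python
from typing import Callable, Dict, List


def under_provisioned_ticks(demand: List[int],
                            decide: Callable[[int], int],
                            delay: int) -> int:
    """Count ticks where active capacity is below demand.

    decide(t) returns the capacity requested at tick t; the request
    becomes active at tick t + delay (the scaling latency).
    """
    capacity = demand[0]          # start right-sized
    pending: Dict[int, int] = {}  # tick -> capacity that activates then
    short = 0
    for t, load in enumerate(demand):
        if t in pending:
            capacity = pending.pop(t)
        if capacity < load:
            short += 1
        pending[t + delay] = decide(t)
    return short


demand = [10, 10, 10, 50, 50, 50, 10, 10]
delay = 2
last = len(demand) - 1

# Reactive: request whatever load is observed right now.
reactive = under_provisioned_ticks(demand, lambda t: demand[t], delay)

# Predictive (perfect forecast): request the load expected `delay` ticks ahead.
predictive = under_provisioned_ticks(
    demand, lambda t: demand[min(t + delay, last)], delay)
```

With this demand trace the reactive policy is under-provisioned for 2 ticks (exactly the scaling delay after the spike begins), while the perfectly forecast predictive policy is never short. A real forecast is imperfect, so the gap would be smaller.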
MongoDB's 2026-04-07 framing of the "imaginary perfect auto-scaling algorithm" articulates elasticity as the theoretical target: "anticipate each customer's needs and perfectly scale their servers up and down, according to their changing demands." The predictive mechanism is the engineering approximation; the reactive mechanism is the backstop when the forecast fails. Together they move elasticity from "eventually right-sized" toward "right-sized throughout the cycle", with a cost benefit as well: "save our customers money and reduce our carbon emissions" (Source: sources/2026-04-07-mongodb-predictive-auto-scaling-an-experiment).
On the tier-based managed-database axis (the MongoDB Atlas M10/M20/.../M60 catalog), elasticity has a lower ceiling than on serverless substrates because tiers are discrete rather than continuous. Predictive scaling still moves between tiers with less observed-latency cost and skips intermediate tiers on sharp demand shifts, giving a tighter approximation inside a discrete-catalog abstraction.
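A minimal sketch of tier skipping on a discrete catalog follows. The tier names and capacities are hypothetical placeholders, not the Atlas catalog, and the rule that the reactive scaler moves one tier per step is an assumption made to show the contrast.

```python
from bisect import bisect_left

# Hypothetical catalog: (name, capacity in arbitrary load units), ascending.
TIERS = [("T1", 100), ("T2", 200), ("T3", 400), ("T4", 800), ("T5", 1600)]
CAPACITIES = [capacity for _, capacity in TIERS]


def smallest_fitting_tier(load: float) -> int:
    """Index of the smallest tier whose capacity covers load (clamped to the top tier)."""
    return min(bisect_left(CAPACITIES, load), len(TIERS) - 1)


def reactive_step(current: int, observed_load: float) -> int:
    """Assumed reactive rule: move at most one tier toward the fitting tier."""
    target = smallest_fitting_tier(observed_load)
    if target > current:
        return current + 1
    if target < current:
        return current - 1
    return current


def predictive_step(current: int, forecast_load: float) -> int:
    """Jump straight to the tier that fits the forecast, skipping intermediates."""
    return smallest_fitting_tier(forecast_load)
```

Starting from `T1` with load heading to 700 units, `predictive_step(0, 700)` returns the index of `T4` in one move, whereas `reactive_step` needs three successive calls to get there. The capacity can still only land on a catalog entry, which is the lower ceiling the paragraph above describes.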
Seen in¶
- sources/2025-03-14-allthingsdistributed-s3-simplicity-is-table-stakes — "elasticity" as the core architectural property of S3, for both capacity and performance.
- sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years — elasticity for compute, including scale-to-zero as a day-one tenet.
- sources/2026-04-07-mongodb-predictive-auto-scaling-an-experiment — MongoDB Atlas's "imaginary perfect auto-scaling algorithm" articulation of elasticity as a theoretical target; predictive + reactive mechanisms composed to approximate it on a tier-catalog substrate.