Aleatoric uncertainty¶
Aleatoric uncertainty is the component of prediction uncertainty that comes from inherent randomness in the system being modelled — not from limited training data, not from model architecture, not from feature gaps. It's the irreducible noise floor: no matter how much more data you collect or how much better your model gets, aleatoric uncertainty doesn't go down.
Contrast: epistemic uncertainty is reducible — it comes from the model not having seen enough similar examples or the features being too weak to separate outcomes. Collecting more of the right data shrinks epistemic uncertainty; it doesn't touch aleatoric.
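The reducible/irreducible split can be made concrete with synthetic data. A minimal sketch (all numbers invented here): the uncertainty about the *mean* outcome shrinks as 1/sqrt(n) with more observations, while the spread of *individual* outcomes stays pinned at the noise floor no matter how much data arrives.

```python
import numpy as np

rng = np.random.default_rng(0)

def observe(n, noise_sd=2.0):
    """Draw n observations of y = 10 + irreducible noise."""
    return 10.0 + rng.normal(0.0, noise_sd, size=n)

for n in (100, 10_000, 1_000_000):
    y = observe(n)
    # Epistemic part: uncertainty about the mean, shrinks as 1/sqrt(n).
    epistemic = y.std(ddof=1) / np.sqrt(n)
    # Aleatoric part: spread of individual outcomes, stays near noise_sd.
    aleatoric = y.std(ddof=1)
    print(f"n={n:>9}  epistemic~{epistemic:.3f}  aleatoric~{aleatoric:.3f}")
```

Collecting 10,000x more data drives the epistemic term toward zero; the aleatoric term never moves off ~2.0.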
Where it shows up in systems¶
- Stochastic load demand on a cluster — user traffic arrives at random times, with random sizes; the next minute's load is genuinely unpredictable at the sub-minute grain. Named as an aleatoric-uncertainty source by Google in the 2025-07-29 RLM post (Source: sources/2025-07-29-google-simulating-large-systems-with-regression-language-models).
- Hardware thermals, contention, microarchitectural noise — the same job on the same machine runs at slightly different speeds each time.
- Adversarial / uncontrolled inputs — ad-click rates, fraud-detection signals, trade executions.
- Measurement noise in the telemetry itself.
Why it matters operationally¶
- It sets the floor for downstream planning. A scheduler that tries to allocate exactly the right capacity for next minute's load cannot be more accurate than the aleatoric noise in the demand process. Planning for aleatoric noise means over-provisioning headroom, not training a better model.
- It's a legitimate reason for fast-path fallback. When the predicted distribution is wide because of real stochasticity in the outcome, pushing to an authoritative slow solver may or may not help (the slow solver faces the same noise) — but knowing why it's wide is what lets you make the call.
- It bounds the ceiling of improvement. If you observe your model still has substantial uncertainty after lots of data and feature work, some of that is aleatoric and retraining won't fix it — consider designing around it (redundancy, headroom, retry) instead.
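"Over-provisioning headroom" in the first bullet has a direct translation: provision at a high quantile of the predicted demand distribution rather than at its mean. A hedged sketch, assuming the predictor exposes samples of next-minute load (the lognormal stand-in below is invented, not from the source):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical: samples of next-minute load from a predictive model
# (e.g. sampled decodes). A lognormal draw stands in for the real demand process.
demand_samples = rng.lognormal(mean=6.0, sigma=0.4, size=1000)

point_forecast = np.mean(demand_samples)        # what a point predictor gives
provision = np.quantile(demand_samples, 0.99)   # headroom against aleatoric spread

headroom = provision / point_forecast
print(f"mean forecast {point_forecast:.0f}, p99 provision {provision:.0f}, "
      f"headroom x{headroom:.2f}")
```

The gap between the mean and the p99 is exactly the capacity that no better model can reclaim; it is the cost of the aleatoric noise itself.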
Why the RLM captures it naturally¶
Text-to-text regression with sampled decodes approximates P(y | x). If the true P(y | x) has positive variance because of aleatoric noise — same x, different y on different runs — then:
- The training target distribution itself has that spread.
- Multiple (x, y) pairs with the same x and different y appear in training.
- The model learns to emit broader sampled distributions on those inputs.
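The mechanics of the last bullet: repeated decodes at a fixed input are draws from the model's learned P(y | x), so their empirical distribution is the uncertainty estimate. A sketch with a synthetic stand-in for the decoder (the function below is hypothetical, not the RLM's API):

```python
import numpy as np

# Hypothetical stand-in for k sampled decodes of a regression LM at one input x:
# each decode is one draw from the model's learned P(y | x).
def sample_decodes(x_seed, k=256, spread=5.0):
    rng_x = np.random.default_rng(x_seed)
    return rng_x.normal(50.0, spread, size=k)

decodes = sample_decodes(x_seed=42)

# The empirical distribution of decodes approximates P(y | x):
lo, med, hi = np.quantile(decodes, [0.05, 0.5, 0.95])
print(f"median {med:.1f}, 90% interval [{lo:.1f}, {hi:.1f}]")
```

On inputs whose training targets were spread out (same x, different y), a well-trained decoder emits a wide interval here; on deterministic inputs it collapses toward a point.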
The RLM doesn't distinguish aleatoric from epistemic by construction — both show up as distribution width — but the decomposition is available post-hoc by comparing distribution width across inputs with similar feature coverage.
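One common post-hoc decomposition (a standard technique, not something the source describes for the RLM) uses an ensemble and the law of total variance: the average within-member variance estimates the aleatoric part, and the variance of the member means estimates the epistemic part. A minimal sketch with synthetic members:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical ensemble of M models, each producing K sampled decodes for the
# same input x. Members disagree about the mean (epistemic); each member has
# also learned a spread around its mean (aleatoric).
M, K = 8, 512
member_means = rng.normal(100.0, 3.0, size=M)            # cross-member disagreement
samples = rng.normal(member_means[:, None], 10.0, size=(M, K))

# Law of total variance: Var(y) = E[Var(y | m)] + Var(E[y | m])
aleatoric = samples.var(axis=1, ddof=1).mean()   # mean of within-member variances
epistemic = samples.mean(axis=1).var(ddof=1)     # variance of member means
print(f"aleatoric~{aleatoric:.1f}  epistemic~{epistemic:.1f}")
```

More data shrinks the second term (members converge); the first term stays near the true noise variance (here 10² = 100).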
Distinct from¶
- Epistemic uncertainty — reducible, data-dependent.
- Model misspecification — the model family can't express the true relationship; neither aleatoric nor epistemic, a third failure mode.
- Out-of-distribution inputs — the model hasn't seen anything like this; surfaces as epistemic uncertainty if the model is well-calibrated.
Seen in¶
- sources/2025-07-29-google-simulating-large-systems-with-regression-language-models — named explicitly as one of the two uncertainty types the RLM captures; "inherent randomness in the system, like stochastic load demand".