LAVA / NILAS / LARS (VM scheduler)¶
LAVA names a trio of lifetime-aware VM-allocation algorithms from Google Research (2025-10-17 announcement; backing paper arXiv:2412.09840v1), as well as the middle algorithm within that trio:
- NILAS — Non-Invasive Lifetime Aware Scoring. Placement scoring layer that adds lifetime-awareness without modifying the underlying allocation algorithm. Read-only on the allocator's decision surface; easy to deploy as a ranking signal on top of an existing placement pipeline.
- LAVA — Lifetime-Aware VM Allocation. Full allocation algorithm using learned lifetime distributions and with built-in adaptation to misprediction. Changes the placement decision itself, not just its scoring.
- LARS — Lifetime-Aware Rescheduling. Post-placement rescheduling layer that keeps tracking lifetime predictions and migrates VMs when the updated picture makes the existing placement inefficient. Generalised on this wiki as patterns/lifetime-aware-rescheduling.
(Source: sources/2025-10-17-google-solving-virtual-machine-puzzles-lava)
Problem class¶
Cloud VM allocation is online bin-packing with unknown disappearance times — Tetris where the pieces fall, fit, and then vanish at unpredictable future times. Two operational failure modes emerge from placement decisions that ignore lifetime:
- Resource stranding — a server's remaining resources are too small or too unbalanced to host any candidate new VM.
- Empty-host loss — too few fully empty hosts remain to satisfy system-maintenance and large-VM-provisioning requirements.
The LAVA trio targets both failure modes simultaneously via lifetime-awareness at three insertion points — scoring (NILAS), allocation (LAVA), rescheduling (LARS).
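As an illustration (not from the source), resource stranding can be sketched as a host whose free capacity fits no offered VM shape, even when plenty of one resource remains; the `Shape` type and shape catalogue below are hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Shape:
    cpus: int
    ram_gb: int

def is_stranded(free: Shape, vm_shapes: list[Shape]) -> bool:
    """A host is stranded when its free resources fit no offered VM shape."""
    return not any(s.cpus <= free.cpus and s.ram_gb <= free.ram_gb
                   for s in vm_shapes)

shapes = [Shape(2, 8), Shape(4, 16), Shape(8, 32)]
print(is_stranded(Shape(1, 64), shapes))  # plenty of RAM, too few CPUs: True
print(is_stranded(Shape(4, 16), shapes))  # fits the 2- and 4-CPU shapes: False
```

The unbalanced case (one CPU left next to 64 GB of RAM) is the one lifetime-aware placement tries to avoid by steering which VMs exit a host when.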
The structural failure mode of naive ML lifetime prediction¶
The 2025-10-17 post names the hazard explicitly: "AI can help with this problem by using learned models to predict VM lifetimes. However, this often relies on a single prediction at the VM's creation. The challenge with this approach is that a single misprediction can tie up an entire host for an extended period, degrading efficiency" (Source: sources/2025-10-17-google-solving-virtual-machine-puzzles-lava).
The LAVA family's answer is continuous reprediction: the model keeps updating its estimate of a VM's expected remaining lifetime as the VM runs, so an early misprediction can be corrected at a later prediction window instead of tying up a host indefinitely.
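A minimal sketch of the continuous-reprediction primitive, assuming the learned model can be summarised as a sample from the lifetime distribution (the post does not describe the model's actual form). Conditioning on survival to the current age is what moves the estimate as the VM keeps running:

```python
import numpy as np

def expected_remaining(lifetimes: np.ndarray, age_hours: float) -> float:
    """Expected remaining lifetime, conditioned on surviving to `age_hours`.

    `lifetimes` stands in for a learned lifetime distribution; conditioning on
    survival is what lets the prediction update as the VM keeps running.
    """
    survivors = lifetimes[lifetimes > age_hours]
    if survivors.size == 0:          # VM has outlived every observed lifetime
        return 0.0
    return float(survivors.mean() - age_hours)

# Hypothetical bimodal fleet: many short-lived batch VMs plus a long-lived tail.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.exponential(2.0, 9000),    # ~hours
                         rng.exponential(500.0, 1000)]) # ~weeks
print(expected_remaining(sample, 0.0))    # young VM: dominated by the short mode
print(expected_remaining(sample, 100.0))  # old VM: almost surely in the long tail
```

The second estimate is much larger than the first: a VM that has already survived 100 hours is almost certainly from the long-lived mode, which is exactly the information a single creation-time prediction throws away.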
Why three algorithms, not one¶
The three names encode three insertion points with three deployment profiles:
| Layer | Insertion point | Decision scope | Deployment risk |
|---|---|---|---|
| NILAS | Scoring | Read-only ranking signal | Low (additive hint) |
| LAVA | Allocation | Changes placement decision | Medium (changes behavior) |
| LARS | Rescheduling | Migrates already-placed VMs | High (disruptive) |
A production deployment can roll out in order — NILAS as a scoring signal only, then LAVA as a full allocator change, then LARS as the migration-triggering rescheduler — with each layer providing independent value at increasing operational risk.
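A hedged sketch of what the low-risk first step could look like: a NILAS-style additive ranking signal layered on an unchanged base score. The actual scoring formula is not in the capture, so the alignment bonus here (prefer hosts whose resident VMs are predicted to exit around the same time, so hosts empty out together) is purely illustrative:

```python
def nilas_score(base_score: float,
                vm_exit: float,
                host_exits: list[float],
                weight: float = 1.0) -> float:
    """Hypothetical non-invasive lifetime-aware scoring.

    Adds a bonus when the candidate VM's predicted exit time aligns with the
    host's resident VMs, so the host can empty out all at once; the base
    allocator's score is otherwise left untouched (read-only on its decisions).
    """
    if not host_exits:
        return base_score            # empty host: no alignment signal
    misalignment = max(abs(vm_exit - e) for e in host_exits)
    return base_score + weight / (1.0 + misalignment)

# Two candidate hosts with equal base scores; the VM is predicted to exit at t=10.
aligned = nilas_score(5.0, vm_exit=10.0, host_exits=[9.0, 11.0])
misfit = nilas_score(5.0, vm_exit=10.0, host_exits=[200.0])
print(aligned > misfit)  # True: prefer the host that empties out together
```

Because the signal is purely additive, turning `weight` down to zero recovers the baseline allocator exactly, which is what makes this layer cheap to roll out and roll back.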
Position relative to other wiki ML-for-systems work¶
- systems/borg / RLM (2025-07-29) — predicts the bin-packer's output (MIPS per GCU from the Borg digital twin); cheap approximator for a slow authoritative solver. Different insertion point: output prediction vs. policy intervention.
- patterns/cheap-approximator-with-expensive-fallback — sibling deployment pattern; both families use calibrated uncertainty as the control signal, but LAVA's "fallback" is "reschedule later when picture updates", not "run the slow solver now".
Editorial framing: the two posts pair naturally as "two Google Research ML-for-systems proof points on Borg-adjacent infrastructure, at different layers". The 2025-10-17 post doesn't cross-reference the 2025-07-29 post explicitly; the wiki surfaces the pairing.
Deployment status¶
Not disclosed in the raw capture. The 2025-10-17 post's intro framing does not specify whether the algorithms are production-deployed on Google Cloud, internal Borg, or paper-only. The backing arXiv paper is the canonical reference for production-deployment claims.
What the wiki doesn't yet have¶
- NILAS's specific scoring formula.
- LAVA's learned-distribution representation (parametric, quantile, histogram, mixture, …).
- LARS's rescheduling-trigger logic (confidence threshold, efficiency-delta threshold, migration cost model).
- Measured production efficiency numbers (stranding reduction %, empty-host preservation %, efficiency vs. single-shot-prediction baseline).
- Borg integration specifics (insertion in the existing scheduler pipeline, rollout shape, online-learning loop).
- Relationship to historical Borg scheduler features that already encode priority / churn tolerance.
All deferred to the arXiv paper.
Seen in¶
- sources/2025-10-17-google-solving-virtual-machine-puzzles-lava — canonical and only wiki source; introduces the algorithmic family, its motivation, and the continuous-reprediction primitive, without describing internal mechanisms.
Related¶
- systems/borg
- concepts/bin-packing
- concepts/vm-lifetime-prediction
- concepts/continuous-reprediction
- concepts/learned-lifetime-distribution
- concepts/resource-stranding
- concepts/empty-host
- concepts/performance-prediction
- concepts/uncertainty-quantification
- patterns/lifetime-aware-rescheduling
- patterns/learned-distribution-over-point-prediction
- patterns/cheap-approximator-with-expensive-fallback
- systems/regression-language-model — sibling ML-for-systems intervention on Borg at a different layer (2025-07-29).
- companies/google