

LAVA / NILAS / LARS (VM scheduler)

LAVA is a trio of lifetime-aware VM-allocation algorithms from Google Research (announced 2025-10-17; backing paper arXiv:2412.09840v1):

  • NILAS (Non-Invasive Lifetime Aware Scoring). Placement-scoring layer that adds lifetime awareness without modifying the underlying allocation algorithm. Read-only on the allocator's decision surface; easy to deploy as a ranking signal on top of an existing placement pipeline.
  • LAVA (Lifetime-Aware VM Allocation). Full allocation algorithm that uses learned lifetime distributions and adapts to misprediction by design. Changes the placement decision itself, not just its scoring.
  • LARS (Lifetime-Aware Rescheduling). Post-placement rescheduling layer that keeps tracking lifetime predictions and migrates VMs when the updated picture makes the existing placement inefficient. The wiki's LARS pattern generalizes this component.

(Source: sources/2025-10-17-google-solving-virtual-machine-puzzles-lava)

Problem class

Cloud VM allocation is online bin-packing with unknown disappearance times — Tetris where the pieces fall, fit, and then vanish at unpredictable future times. Two operational failure modes emerge from placement decisions that ignore lifetime:

  • Resource stranding — a server's remaining resources are too small or too unbalanced to host any candidate new VM.
  • Empty-host loss — too few fully empty hosts remain to satisfy system-maintenance and large-VM provisioning requirements.

The LAVA trio targets both failure modes simultaneously via lifetime-awareness at three insertion points — scoring (NILAS), allocation (LAVA), rescheduling (LARS).
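Both failure modes can be made concrete with a toy fleet model. This is a minimal illustrative sketch, not anything from the paper; the VM shapes, host sizes, and metric definitions are all assumptions chosen for clarity:

```python
# Toy fleet model illustrating the two failure modes (illustrative only;
# shapes and thresholds are not from the LAVA paper).

CANDIDATE_SHAPES = [(1, 4), (4, 16), (16, 64)]  # (vCPUs, GiB) of plausible new VMs

def is_stranded(free_cpu, free_ram):
    """A host is stranded if its leftover capacity fits no candidate VM shape."""
    return not any(c <= free_cpu and r <= free_ram for c, r in CANDIDATE_SHAPES)

def fleet_health(hosts):
    """hosts: list of (free_cpu, free_ram, total_cpu, total_ram) tuples."""
    stranded = sum(is_stranded(fc, fr) for fc, fr, _, _ in hosts)
    empty = sum(fc == tc and fr == tr for fc, fr, tc, tr in hosts)
    return {"stranded": stranded, "empty": empty}

hosts = [
    (0, 2, 32, 128),     # no vCPUs left: stranded
    (8, 2, 32, 128),     # CPU free but RAM too unbalanced: stranded
    (32, 128, 32, 128),  # fully empty: usable for maintenance / large VMs
    (16, 64, 32, 128),   # can still host any candidate shape
]
print(fleet_health(hosts))  # {'stranded': 2, 'empty': 1}
```

Lifetime-ignorant placement tends to grow the `stranded` count (leftover slivers nobody fits) and shrink the `empty` count (every host ends up hosting at least one long-lived straggler).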

The structural failure mode of naive ML lifetime prediction

The 2025-10-17 post names the hazard explicitly: "AI can help with this problem by using learned models to predict VM lifetimes. However, this often relies on a single prediction at the VM's creation. The challenge with this approach is that a single misprediction can tie up an entire host for an extended period, degrading efficiency" (Source: sources/2025-10-17-google-solving-virtual-machine-puzzles-lava).

The LAVA family's answer is continuous reprediction: the model keeps updating its estimate of a VM's expected remaining lifetime as the VM runs, so an early misprediction can be corrected at later prediction windows instead of tying up a host for the VM's whole run.
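One way to picture continuous reprediction is as a conditional expectation: the expected remaining lifetime of a VM given that it has already survived to its current age. The sketch below computes this over an empirical sample of past lifetimes; the paper's actual learned model is not disclosed in the capture, so everything here (the sample, the estimator) is an illustrative assumption:

```python
# Sketch of continuous reprediction as E[L - t | L > t]: expected remaining
# lifetime conditioned on the VM having survived to age t. Illustrative only;
# not the LAVA paper's actual model.

def expected_remaining(lifetimes, age_hours):
    """Empirical estimate of E[L - t | L > t] from a sample of past lifetimes."""
    survivors = [l for l in lifetimes if l > age_hours]
    if not survivors:
        return 0.0
    return sum(l - age_hours for l in survivors) / len(survivors)

past_lifetimes = [0.5, 1, 1, 2, 4, 24, 24, 168, 720]  # hours, heavy-tailed

# With a heavy-tailed sample the prediction shifts upward as the VM keeps
# running: a VM that has already survived a day is likely long-lived, so an
# early "short-lived" misprediction gets corrected at later windows.
for age in (0, 1, 24, 168):
    print(age, round(expected_remaining(past_lifetimes, age), 1))
```

The heavy tail is what makes reprediction matter: under such distributions, age is strong evidence about remaining lifetime, so a single creation-time prediction is structurally stale.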

Why three algorithms, not one

The three names encode three insertion points with three deployment profiles:

Layer   Insertion point   Decision scope                Deployment risk
NILAS   Scoring           Read-only ranking signal      Low (additive hint)
LAVA    Allocation        Changes placement decision    Medium (changes behavior)
LARS    Rescheduling      Migrates already-placed VMs   High (disruptive)

A production deployment can roll out in order — NILAS as a scoring signal only, then LAVA as a full allocator change, then LARS as the migration-triggering rescheduler — with each layer providing independent value at increasing operational ambition.
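The NILAS stage is the cheapest to picture: keep the existing allocator's score and add a read-only lifetime-alignment term on top, without touching the argmax itself. The actual NILAS scoring formula is not disclosed in the capture, so the alignment term and weight below are hypothetical:

```python
# Sketch of a NILAS-style deployment: the existing allocator's score plus a
# read-only lifetime hint. The real NILAS formula is not disclosed in the
# capture; this alignment term and its weight are hypothetical.

def base_score(host, vm):
    """Stand-in for the existing allocator's score (best fit: tighter is better)."""
    return -(host["free_cpu"] - vm["cpu"])

def lifetime_alignment(host, vm):
    """Prefer hosts whose resident VMs are predicted to exit near this VM's
    predicted exit, so the host can drain to empty instead of stranding."""
    if not host["predicted_exits"]:
        return 0.0
    nearest = min(abs(e - vm["predicted_exit"]) for e in host["predicted_exits"])
    return -nearest  # smaller gap to an existing exit time scores higher

def nilas_score(host, vm, weight=0.1):
    return base_score(host, vm) + weight * lifetime_alignment(host, vm)

vm = {"cpu": 4, "predicted_exit": 100.0}
hosts = [
    {"free_cpu": 4, "predicted_exits": [900.0]},  # tight fit, mismatched lifetimes
    {"free_cpu": 8, "predicted_exits": [110.0]},  # looser fit, aligned lifetimes
]
best = max(hosts, key=lambda h: nilas_score(h, vm))  # picks the aligned host
```

Because the hint is purely additive, setting `weight=0` recovers the original allocator exactly — which is what makes this layer the low-risk first step of the rollout.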

Position relative to other wiki ML-for-systems work

  • systems/borg / RLM (2025-07-29) — predicts the bin-packer's output (MIPS per GCU from the Borg digital twin); cheap approximator for a slow authoritative solver. Different insertion point: output prediction vs. policy intervention.
  • patterns/cheap-approximator-with-expensive-fallback — sibling deployment pattern; both families use calibrated uncertainty as the control signal, but LAVA's "fallback" is "reschedule later when picture updates", not "run the slow solver now".

Editorial framing: the two posts pair naturally as "two Google Research ML-for-systems proof points on Borg-adjacent infrastructure, at different layers". The 2025-10-17 post doesn't cross-reference the 2025-07-29 post explicitly; the wiki surfaces the pairing.

Deployment status

Not disclosed in the raw capture. The 2025-10-17 post's intro framing does not specify whether the algorithms are production-deployed on Google Cloud, internal Borg, or paper-only. The backing arXiv paper is the canonical reference for production-status claims.

What the wiki doesn't yet have

  • NILAS's specific scoring formula.
  • LAVA's learned-distribution representation (parametric, quantile, histogram, mixture, …).
  • LARS's rescheduling-trigger logic (confidence threshold, efficiency-delta threshold, migration cost model).
  • Measured production efficiency numbers (stranding reduction %, empty-host preservation %, efficiency vs. single-shot-prediction baseline).
  • Borg integration specifics (insertion in the existing scheduler pipeline, rollout shape, online-learning loop).
  • Relationship to historical Borg scheduler features that already encode priority / churn tolerance.

All deferred to the arXiv paper.
