CONCEPT Cited by 1 source
Resource stranding¶
Resource stranding is the failure mode where a server's remaining resources are too small or unbalanced to host any candidate workload, effectively wasting capacity. The host is not empty — something is running on it — but the remainder (free CPU × free RAM × free disk × free NIC × free accelerator) doesn't match the shape of any job waiting to be placed. The bin is partially full and no remaining piece fits.
Named directly in Google Research's 2025-10-17 LAVA post: "Poor VM allocation can lead to 'resource stranding', where a server's remaining resources are too small or unbalanced to host new VMs, effectively wasting capacity" (Source: sources/2025-10-17-google-solving-virtual-machine-puzzles-lava).
Two distinct shapes¶
- Absolute stranding. Remaining resources are below the minimum any candidate VM shape requires. E.g. 1 GB RAM free on a host — no VM shape in the catalogue fits.
- Shape stranding. Remaining resources are adequate in aggregate but imbalanced relative to the catalogue's VM shapes. E.g. 32 GB RAM free but only 1 vCPU free: no commonly requested shape has a 32:1 RAM:CPU ratio (32 GB per vCPU), so the leftover is unusable even though there's plenty of it.
Shape stranding is the harder problem at cloud scale: VM shapes come from a discrete catalogue, so leftover shapes outside the catalogue can't be "merged" or "resized" into the catalogue without rebalancing existing placements.
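The two cases can be made concrete with a small classifier. This is a minimal sketch, not anything taken from the LAVA post: the `Shape` type, the three-entry catalogue, and the rule used to separate the two cases (absolute when the remainder is short on every dimension, shape when at least one dimension would still be enough for some catalogue entry) are all assumptions made for illustration. Only vCPU and RAM are modelled; disk, NIC and accelerators are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Shape:
    """A resource vector: a VM shape or a host's free remainder (hypothetical type)."""
    vcpus: int
    ram_gb: int


def fits(vm: Shape, free: Shape) -> bool:
    """A VM fits only if every dimension fits."""
    return vm.vcpus <= free.vcpus and vm.ram_gb <= free.ram_gb


def classify_remainder(free: Shape, catalogue: List[Shape]) -> str:
    """Label a host's remainder as 'full', 'usable', 'absolute' or 'shape'."""
    if free.vcpus == 0 and free.ram_gb == 0:
        return "full"  # nothing left over, so nothing is wasted
    if any(fits(vm, free) for vm in catalogue):
        return "usable"  # at least one catalogue shape can still be placed
    # Nothing fits. Assumed rule: short on every dimension => absolute,
    # otherwise some dimension has plenty and the imbalance strands it.
    min_vcpus = min(vm.vcpus for vm in catalogue)
    min_ram = min(vm.ram_gb for vm in catalogue)
    if free.vcpus < min_vcpus and free.ram_gb < min_ram:
        return "absolute"
    return "shape"


# Hypothetical catalogue with a fixed 4 GB-per-vCPU ratio.
CATALOGUE = [Shape(2, 8), Shape(4, 16), Shape(8, 32)]

print(classify_remainder(Shape(1, 1), CATALOGUE))   # short on both dimensions
print(classify_remainder(Shape(1, 32), CATALOGUE))  # plenty of RAM, too little CPU
print(classify_remainder(Shape(4, 16), CATALOGUE))  # a catalogue shape still fits
```

Under these assumptions the 32 GB / 1 vCPU remainder from the example is labelled as shape stranding: the RAM alone would satisfy any catalogue entry, but no entry can use it without more CPU.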
Economic weight¶
At data-center scale, a few percentage points of stranded capacity is real money. The 2025-10-17 post pairs stranding with loss of empty hosts as the two failure modes the LAVA family targets — both are second-order consequences of placement decisions made without knowing how long each placed workload will run.
Why lifetime prediction changes the picture¶
If the scheduler knew lifetimes, it could co-place workloads of similar duration so that everything on a host finishes at about the same time. A host that drains together frees its capacity in one piece instead of leaving awkward fragments. Without lifetime information, the scheduler commits to placements that look efficient at t=0 but strand capacity at t=k, because long-lived and short-lived workloads intermix on the same host.
The LAVA family's pitch is that continuous reprediction and lifetime distributions let the scheduler both co-place by expected duration and rebalance via LARS when the remaining-lifetime picture updates.
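Co-placing by expected duration can be sketched as a host-selection rule. The sketch below is not the LAVA, NILAS or LARS algorithm; it is an illustrative stand-in that assumes a single point prediction per VM rather than a lifetime distribution, and the `Host` type, `drain_time` and `pick_host` are hypothetical names. Among hosts with enough free vCPUs, it picks the one whose predicted drain time is closest to the new VM's predicted lifetime.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Host:
    """Hypothetical host record: free vCPUs plus predicted end times of its VMs."""
    name: str
    free_vcpus: int
    end_times: List[float] = field(default_factory=list)  # hours from now

    def drain_time(self) -> float:
        """Predicted time at which the host becomes empty (0.0 if already empty)."""
        return max(self.end_times, default=0.0)


def pick_host(hosts: List[Host], vcpus: int, predicted_lifetime: float) -> Optional[Host]:
    """Choose the feasible host whose drain time best matches the new VM."""
    candidates = [h for h in hosts if h.free_vcpus >= vcpus]
    if not candidates:
        return None
    # Smallest gap between the host's drain time and the VM's predicted end.
    return min(candidates, key=lambda h: abs(h.drain_time() - predicted_lifetime))


def place(host: Host, vcpus: int, predicted_lifetime: float) -> None:
    """Record the placement on the chosen host."""
    host.free_vcpus -= vcpus
    host.end_times.append(predicted_lifetime)


hosts = [
    Host("short-lived", free_vcpus=4, end_times=[2.0]),
    Host("long-lived", free_vcpus=4, end_times=[500.0]),
]

short_vm = pick_host(hosts, vcpus=2, predicted_lifetime=1.5)
place(short_vm, 2, 1.5)
long_vm = pick_host(hosts, vcpus=2, predicted_lifetime=400.0)
place(long_vm, 2, 400.0)
print(short_vm.name, long_vm.name)
```

A rule like this keeps the short VM away from the host that will stay occupied for hundreds of hours, so the short-lived host can drain completely. It ignores empty-host preservation and prediction error, both of which a real scheduler would need to handle.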
Seen in¶
- sources/2025-10-17-google-solving-virtual-machine-puzzles-lava — canonical wiki instance; resource stranding named as one of the two operational failure modes the LAVA / NILAS / LARS algorithmic family is designed to reduce, alongside loss of empty hosts.