
CONCEPT Cited by 4 sources

Bin-packing

Bin-packing is the combinatorial-optimisation problem of placing a set of items with known sizes into the smallest number of fixed-capacity bins (or equivalently: maximising the efficiency with which a fixed set of bins is filled). Classical bin-packing is NP-hard in its general form. Cluster-scheduling variants are multi-dimensional (CPU, RAM, disk, network, GPU, TPU, affinity, anti-affinity, QoS), which only adds difficulty.

Bin-packing is the core decision primitive of almost every cluster scheduler the wiki touches:

  • Borg — Google's cluster manager; the 2025-07-29 RLM post names "a specialized bin-packing algorithm used to efficiently allocate tasks to resources" as the engine whose output the RLM predicts (Source: sources/2025-07-29-google-simulating-large-systems-with-regression-language-models).
  • Kubernetes scheduler — its filtering / scoring loop is a multi-dimensional bin-packer over pod resource requests.
  • Karpenter — AWS's K8s-native autoscaler picks instance types by solving a bin-packing problem over pending pods vs. instance SKUs.
  • EKS Auto Mode — encapsulates a Karpenter-style bin-packer behind a managed-data-plane API.
  • Apache Mesos, HashiCorp Nomad, YARN, Slurm — all variants of the same primitive.

Why it's expensive at scale

The scheduler doesn't just solve one bin-packing instance — it re-solves as jobs arrive, fail, scale, or get preempted, and it considers many candidate placements per decision. At Google-fleet scale, running the full combinatorial solver on every candidate is what motivates replacing it with a cheap ML approximator: a predictor that estimates the outcome of the bin-packer is orders of magnitude cheaper per query than actually running it.

As a prediction target (not just a solver)

The 2025-07-29 RLM work treats bin-packing differently from most scheduler literature: the bin-packer is not the thing being improved — it's the thing being predicted. The RLM's target is the bin-packer's numeric output (MIPS per GCU) on a given cluster state. This is a slightly unusual framing that shows up when:

  • The authoritative solver is too slow to run in the inner loop.
  • You need a fast answer that tracks what the slow solver would have said.
  • Uncertainty-flagged cases can fall back to the slow solver.

In that framing, bin-packing is the performance-prediction target function, not a design concern of the predictor itself.
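That predict-with-fallback loop can be sketched in a few lines. This is a minimal illustration, not the RLM's actual interface: the function names, the uncertainty signal, and the threshold are all hypothetical stand-ins.

```python
def slow_bin_packer(cluster_state):
    """Stand-in for the authoritative solver (expensive in reality).
    Hypothetical: returns a packing-efficiency score for a cluster state."""
    return sum(cluster_state) / len(cluster_state)

def fast_predictor(cluster_state):
    """Stand-in for a learned approximator.
    Returns (estimate, uncertainty); a real model would be trained on
    the slow solver's outputs."""
    estimate = sum(cluster_state) / len(cluster_state)
    uncertainty = 0.01  # a real model would emit calibrated uncertainty
    return estimate, uncertainty

def predict_with_fallback(cluster_state, threshold=0.05):
    """Serve the cheap estimate; escalate to the slow solver only when
    the predictor flags its own answer as uncertain."""
    estimate, uncertainty = fast_predictor(cluster_state)
    if uncertainty > threshold:
        return slow_bin_packer(cluster_state)  # authoritative answer
    return estimate
```

The design point is that the slow solver stays authoritative: the predictor only short-circuits queries it is confident about.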

Standard approximation strategies (the solver side)

Outside the RLM framing, bin-packing at cluster scale uses well-known heuristic approximators:

  • First Fit / Best Fit / Worst Fit / Next Fit Decreasing — classical O(n log n) heuristics with approximation-ratio guarantees.
  • Linear relaxation + rounding — solve the LP relaxation, round to integers.
  • Score-based placement — score each (pod, node) pair by a weighted sum of fit metrics; pick the highest-scoring candidate. This is the Kubernetes scheduler's canonical approach.
  • Bin-packing-aware autoscaling — Karpenter's "pick an instance type that maximally absorbs the pending pods."

The 2025-07-29 post doesn't describe Borg's specific bin-packing algorithm — only that it's "specialised" and used inside the digital-twin backtester to generate the MIPS-per-GCU training signal.

Online bin-packing with unknown disappearance times

Cloud VM allocation is a special instance: online bin-packing where the pieces appear and disappear at times unknown at placement. The 2025-10-17 Google LAVA post opens with the framing directly — "a puzzle game similar to Tetris with pieces rapidly falling onto a stack. Some fit perfectly. Others don't… the 'pieces' (or VMs) appear and disappear, some with a lifespan of only minutes, and others, days" (Source: sources/2025-10-17-google-solving-virtual-machine-puzzles-lava).

This variant has two named operational failure modes distinct from static bin-packing:

  • Resource stranding — remaining per-host resources too small or unbalanced to host any candidate new piece; the host is not empty but the leftover is unusable.
  • Empty-host loss — the cluster's reserve of fully-empty hosts (needed for maintenance + large-VM provisioning) erodes when placement ignores lifetime.

Both are second-order effects of placement decisions made without lifetime information. The LAVA family's answer is ML-based lifetime prediction with continuous reprediction + learned lifetime distributions driving LARS-style rebalancing (see systems/lava-vm-scheduler).
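Resource stranding is easy to state as a predicate. The sketch below is illustrative only (the VM shapes and two-dimensional capacity tuples are made-up, not Google's): a host is stranded when it is occupied but its leftover resources cannot fit even the smallest candidate shape.

```python
# Hypothetical VM shapes as (cpu, ram_gb) tuples.
VM_SHAPES = [(2, 8), (4, 16), (8, 32)]

def is_stranded(host_capacity, placed_vms, shapes=VM_SHAPES):
    """True when the host is occupied but its leftover (per-dimension)
    resources cannot host any candidate VM shape."""
    if not placed_vms:
        return False  # an empty host is reserve capacity, not stranded
    leftover = tuple(cap - sum(vm[i] for vm in placed_vms)
                     for i, cap in enumerate(host_capacity))
    fits_any = any(all(l >= s for l, s in zip(leftover, shape))
                   for shape in shapes)
    return not fits_any
```

A host with (16, 64) capacity holding (8, 32) and (7, 16) VMs has (1, 16) left over: no shape fits, so the 1 CPU / 16 GB is stranded.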

Bin-packing over time vs over space (2026-02-11)

Google Research's 2026-02-11 scheduling-under-time-varying-capacity post introduces the over-time sibling of the over-space online bin-packing captured in the 2025-10-17 LAVA post. Where LAVA allocates VMs into host capacity (CPU / RAM / disk are the dimensions), the 2026-02-11 paper allocates batch jobs into a wall-clock capacity profile (time is the dimension) (Source: sources/2026-02-11-google-scheduling-in-a-changing-world-time-varying-capacity).

Both share the "commit under uncertainty about future arrivals" hazard and both are online, but the primitives differ.

The two papers together frame "scheduling-as-online-bin-packing" along two axes: over-space allocation with lifetime uncertainty (Tetris where pieces disappear) and over-time allocation with capacity variation (scheduling jobs into a varying time budget). Both sit on Google's production Borg substrate; the over-time paper is a theoretical contribution, while the over-space paper has a deployment-shaped narrative.
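The over-time primitive can be made concrete with a greedy toy model. This is not the 2026-02-11 paper's algorithm, just a minimal sketch of "fit jobs into a time-varying capacity profile": each job is a (demand, duration) pair, and we place it at its earliest feasible start.

```python
def earliest_feasible_start(capacity, load, demand, duration):
    """Earliest start slot where `demand` fits under the time-varying
    capacity profile for `duration` consecutive slots, else None."""
    horizon = len(capacity)
    for start in range(horizon - duration + 1):
        if all(load[t] + demand <= capacity[t]
               for t in range(start, start + duration)):
            return start
    return None

def schedule(capacity, jobs):
    """Greedily place each (demand, duration) job at its earliest
    feasible start, accumulating load on the capacity profile."""
    load = [0] * len(capacity)
    placements = []
    for demand, duration in jobs:
        start = earliest_feasible_start(capacity, load, demand, duration)
        placements.append(start)
        if start is not None:
            for t in range(start, start + duration):
                load[t] += demand
    return placements
```

With profile `[4, 4, 2, 2, 4, 4]` and jobs `[(2, 3), (4, 2), (2, 2)]`, the capacity dip in the middle forces the wide job to start at slot 4: the placements come out `[0, 4, 0]`.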
