GOOGLE 2026-02-11 Tier 1

Google Research — Scheduling in a changing world: Maximizing throughput with time-varying capacity

Summary

Google Research post (2026-02-11) on online throughput-maximizing scheduling under a time-varying machine-capacity profile — the algorithmic-theory problem underlying batch cluster schedulers that must commit to job placements without knowing what future jobs will arrive. Core result shape: with no preemption, the competitive ratio of any online algorithm approaches zero — a single long-job commitment can foreclose arbitrarily many future short jobs; with interrupt-and-restart preemption (discard partial work, the job stays in the queue), a simple schedule-the-job-that-finishes-earliest greedy recovers the offline ½-competitive bound; with no-restart preemption (discarded jobs are lost forever), the competitive ratio again collapses to zero in general — but it is restored to a constant under the production-grade assumption that all jobs share a common deadline (e.g. "all data processing must finish by the nightly batch run"). For the common-deadline unit-capacity variant, the authors devise a novel four-action algorithm that maintains a tentative schedule over already-arrived jobs and updates it on each arrival. Theoretical paper from Google Research; connects directly to Borg-style cluster-scheduling problems where "completing many short jobs is much more profitable than completing one long job" is the exact economic shape of the scheduler's objective.

⚠️ Raw-file scope note. The local raw scrape (raw/google/2026-02-11-scheduling-in-a-changing-world-maximizing-throughput-with-ti-5a3ce6ec.md, ~29 lines) captures only the online-setting section — the offline-setting results, the introduction motivating time-varying capacity profiles, the full four-action algorithm specification, general (non-unit-capacity) extensions, approximation-ratio numbers for the common-deadline algorithm, and the paper's overall contribution list are referenced but not fully captured. Wiki pages are scoped to what the raw verifiably contains; deeper details marked as gaps throughout.

Key takeaways

  1. Online throughput scheduling has a non-trivial competitive-ratio landscape — preemption policy is the load-bearing design knob. "The standard non-preemptive algorithms fail completely here as their competitive ratio approaches zero" under no preemption. Adding interrupt-with-restart recovers the offline ½-competitive bound. Restricting to interrupt-without-restart collapses back to zero-competitive in general. The three regimes are not close to each other — the choice of preemption semantics is the first-order architectural decision (Source: sources/2026-02-11-google-scheduling-in-a-changing-world-time-varying-capacity).

  2. A single long-job commit can destroy arbitrarily much future throughput. "A single bad decision of scheduling a long job can ruin the possibility of scheduling many future smaller jobs. … If you imagine that each completed job brings equal weight, regardless of its length, completing many short jobs is much more profitable than completing one long job." This is the mechanism that drives the non-preemptive lower bound to zero: the adversary presents one long job first, the scheduler must commit, then a burst of short jobs arrives and can't be scheduled. This is the exact shape of head-of-line blocking at the job-scheduler layer — the theoretical sibling of TCP head-of-line blocking and storage-queue HOL (Source: sources/2026-02-11-google-scheduling-in-a-changing-world-time-varying-capacity).

  3. Interrupt-and-restart preemption is "highly beneficial" and matches the offline bound. "A variant of Greedy that iteratively schedules the job that finishes earliest continues to achieve a 1/2-competitive ratio, matching the result in the offline setting." The primitive being evaluated: "only jobs restarted and later completed non-preemptively count as successful" — partial work does not count, so the policy is genuinely [[patterns/interrupt-and-restart|interrupt-and-restart]] semantics, not checkpoint-and-resume. Constant-competitive bounds mean the algorithm never does worse than a fixed factor below the offline optimum — a guarantee the production scheduler can reason about (Source: sources/2026-02-11-google-scheduling-in-a-changing-world-time-varying-capacity).

  4. Interrupt-without-restart is adversarially unwinnable in general — but tractable under common deadlines. "In this stricter model, all work performed on the interrupted job is lost and the job itself is discarded forever. Unfortunately, we find that in this strict model, any online algorithm can encounter a sequence of jobs that forces it into decisions which prevent it from satisfying much more work in the future. Once again, the competitive ratio of all online algorithms approaches zero." The paper's escape from this impossibility: restrict to the practical case of "all jobs share a common deadline (e.g., all data processing must finish by the nightly batch run)", for which they "devise novel constant competitive algorithms" (Source: sources/2026-02-11-google-scheduling-in-a-changing-world-time-varying-capacity).

  5. Production framing is batch-cluster-nightly-deadline — the theory maps onto Borg-adjacent scheduling. The named motivating example for the common-deadline model is "all data processing must finish by the nightly batch run" — the exact shape of a nightly ETL / batch-job-deadline environment where every job must land before a shared wall-clock cutoff. The "time-varying capacity" framing in the title also matches the real behaviour of cluster machines — diurnal interactive-load + spot-preemption + hardware failures continuously change the capacity profile available to batch jobs. This is scheduling theory explicitly targeted at Borg-style production environments, not an abstract mathematical exercise (Source: sources/2026-02-11-google-scheduling-in-a-changing-world-time-varying-capacity).

  6. The four-action tentative-schedule algorithm is the concrete deliverable for common-deadline / unit-capacity. On each job arrival, the algorithm "maintains a tentative schedule by assigning the jobs that have already arrived to disjoint time intervals" and "modifies the tentative schedule by taking the first applicable action out of the following four actions". Structurally this is a speculative planner over the queue — not greedy commit on arrival — that repeatedly revises the committed plan as new jobs land. Tentative-schedule-then-revise is a broadly-applicable pattern: same shape as LARS rescheduling (Google 2025-10-17), same shape as TCP congestion window tentative-update-on-ack, same shape as any optimistic-concurrency planner (Source: sources/2026-02-11-google-scheduling-in-a-changing-world-time-varying-capacity).

  7. The raw capture is fragment-only — full four-action specification, general-capacity extensions, constant values, approximation numbers, and offline-setting results are not in scope. Content captured: introduction of competitive-ratio model + three preemption regimes + common-deadline framing + first two sentences of the four-action algorithm description. Not captured: the four actions themselves, the general (non-unit-capacity) case, the offline-setting results referenced as "matching the result in the offline setting", the specific constant in the common-deadline constant-competitive bound, any production-deployment-on-Borg details, and the paper's contribution list. Wiki pages are scoped to what the raw verifiably contains; deeper specifics marked as gaps (Source: sources/2026-02-11-google-scheduling-in-a-changing-world-time-varying-capacity).
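The long-job-blocks-short-jobs mechanism behind takeaway 2 can be shown with a tiny simulation. This is a hedged sketch: the instance shape (one long job, then a burst of unit jobs sharing its deadline) follows the adversarial argument quoted above, but the simulator, job tuples, and function names are illustrative, not from the paper.

```python
def nonpreemptive_greedy(jobs):
    """Commit to each job on arrival if the machine is free; never interrupt.

    jobs: list of (arrival, length, deadline) on a unit-capacity machine.
    """
    busy_until = 0
    completed = 0
    for arrival, length, deadline in sorted(jobs):
        # Jobs that arrive while the machine is committed are lost: in this
        # instance they cannot meet their deadline after the long job ends.
        if arrival >= busy_until and arrival + length <= deadline:
            busy_until = arrival + length
            completed += 1
    return completed

def offline_optimum(jobs):
    """With full knowledge, skip the long job and run every unit job."""
    return sum(1 for _, length, _ in jobs if length == 1)

L, n = 1000, 100
long_job = (0, L, L)                             # arrives first; greedy commits
short_jobs = [(1 + i, 1, L) for i in range(n)]   # burst of unit jobs, common deadline
jobs = [long_job] + short_jobs

alg = nonpreemptive_greedy(jobs)   # completes only the long job: 1
opt = offline_optimum(jobs)        # completes all n short jobs: 100
ratio = alg / opt                  # 1/n, which approaches zero as n grows
```

Growing n drives the ratio to zero, which is the sense in which "the competitive ratio of any online algorithm approaches zero" without preemption.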

Operational numbers

  • No-preemption competitive ratio: "approaches zero" as input size grows — in the adversarial worst case, an online non-preemptive algorithm can achieve arbitrarily less throughput than the offline optimum.
  • Interrupt-with-restart competitive ratio: ½ — achieved by the "schedule the job that finishes earliest" greedy variant, matching the offline bound.
  • Interrupt-without-restart (general case) competitive ratio: "approaches zero" — adversarially unwinnable.
  • Interrupt-without-restart with common deadlines: "novel constant competitive algorithms" — specific constant is in the paper, not in the raw capture.
  • Algorithm detail captured: the first two sentences of the four-action tentative-schedule algorithm for unit-capacity common-deadline; the four actions themselves are not captured.

Caveats

  • Theoretical paper, not a production-deployment retrospective. The raw is pitched as research-side algorithmic theory with competitive-ratio analysis — there's no disclosure of whether the algorithm is deployed on Borg, what production savings look like, or how it interacts with existing Borg scheduler components (the LAVA family or the RLM). The production connection is the motivating-example framing, not a production-proof-point claim.
  • Short raw capture. Only the online-setting portion is scraped; the offline setting, algorithm details, and production numbers are referenced but not captured. A full ingest of the linked paper would materially expand the wiki's coverage of this work; current pages are scoped to what the raw verifiably contains.
  • "Jobs have equal weight, regardless of length" assumption. The key-takeaway-2 framing of "each completed job brings equal weight" is the specific throughput objective the paper optimises — this is the unit-throughput / count-jobs-completed setting, not a weighted-throughput or SLA-aware setting. Real cluster schedulers often have weighted priorities, QoS classes, and SLO deadlines per job; the theoretical result applies to the unit-weight variant captured here.
  • Time-varying capacity is in the title but not quantitatively developed in the raw capture. The raw captures the online-setting preemption analysis but does not develop the time-varying capacity framing beyond the title. The wiki page for online scheduling treats time-varying capacity as a context for the captured results rather than as an independently-analysed dimension.

Systems introduced

None. The paper is an algorithmic-theory contribution without an independently-named system artifact in the raw capture.

Systems extended

  • systems/borg — the production target implied by the "all data processing must finish by the nightly batch run" framing. Scheduling-theory angle complements the existing Borg entries: 2025-07-29 RLM (bin-packer-output prediction) + 2025-10-17 LAVA / NILAS / LARS (VM-allocation policy), now with a third Google Research ML-adjacent-theory proof point at the online-throughput layer — competitive-ratio analysis of preemption semantics for batch-job scheduling with common deadlines. Production deployment not disclosed in the raw capture.

Concepts introduced

  • concepts/competitive-ratio — "the worst case comparison between the throughput of our online algorithm and the throughput of an optimal algorithm that is aware of all the jobs apriori". The load-bearing quality metric for online-algorithm analysis; a constant-competitive algorithm never does worse than a fixed factor below the offline optimum regardless of input sequence.
  • concepts/online-scheduling — the problem class where "jobs arrive dynamically and the scheduler must make immediate, irrevocable decisions without knowing what jobs will arrive next". Online scheduling differs from offline scheduling in that the scheduler sees the input one job at a time; adversarial analysis characterises the worst possible input sequence.
  • concepts/common-deadline-scheduling — the practical restriction where "all jobs share a common deadline (e.g., all data processing must finish by the nightly batch run)". Used to dodge the adversarial impossibility of interrupt-without-restart scheduling in general — by constraining deadlines to a single wall-clock cutoff, constant-competitive algorithms become feasible.
  • concepts/tentative-schedule — the algorithmic primitive of "assigning the jobs that have already arrived to disjoint time intervals" as a speculative plan, then revising it on each arrival via a fixed sequence of update actions. Distinct from greedy-commit-on-arrival: the commitment is revisable until execution begins.
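The tentative-schedule primitive can be sketched in a few lines. Important hedge: the paper's four actions are NOT in the raw capture, so the two revision steps below (append into free time; displace a longer tentative job and retry) are an illustrative stand-in for the revise-on-arrival shape only — they are not the paper's rule. `revise_on_arrival`, its parameters, and the interval layout are all assumptions.

```python
def revise_on_arrival(schedule, now, deadline, length):
    """Revise a tentative schedule when a new job of `length` arrives.

    schedule: list of (start, length) disjoint intervals on a unit-capacity
    machine, all of which must finish by the common `deadline`.
    Returns a new schedule; the input list is never mutated.
    """
    # Illustrative action 1: append the job after everything already planned,
    # if it still meets the common deadline.
    end = max([s + l for s, l in schedule] + [now])
    if end + length <= deadline:
        return schedule + [(end, length)]
    # Illustrative action 2: displace one strictly longer tentative job
    # (more short jobs completed = more throughput) and retry.
    longest = max(schedule, key=lambda sl: sl[1], default=None)
    if longest is not None and longest[1] > length:
        kept = [sl for sl in schedule if sl != longest]
        return revise_on_arrival(kept, now, deadline, length)
    return schedule  # reject: no beneficial revision found
```

The key property the sketch preserves: nothing is committed until execution begins, so a long job that was tentatively planned can still be displaced by later-arriving short jobs — the opposite of greedy-commit-on-arrival.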

Concepts extended

  • concepts/bin-packing — adds a sibling-angle to the existing bin-packing content: where the LAVA / NILAS / LARS work (2025-10-17) is online bin-packing in space (CPU / RAM / disk capacity), this paper is online scheduling in time (capacity profile varying over wall-clock). Both share the "commit under uncertainty about future arrivals" hazard, and both use similar primitives (learned-lifetime distributions over the future in LAVA; tentative-schedule revision in this paper). The two Google Research papers pin bin-packing-over-time and bin-packing-over-space as sibling sub-genres within the Borg-adjacent scheduling theory.

Patterns introduced

  • patterns/interrupt-and-restart — preemption semantics where an active job can be interrupted; the partial work is lost, but the job returns to the queue and can be retried. The paper's key-takeaway-3 result: interrupt-and-restart is "highly beneficial" — matches the offline ½-competitive bound with a simple greedy. Contrast the stricter interrupt-without-restart semantics (discard job entirely), which is adversarially unwinnable in general.
  • patterns/earliest-finish-job-greedy — the specific greedy variant that "iteratively schedules the job that finishes earliest". Achieves ½-competitive under interrupt-and-restart; intuition: always committing to the shortest-remaining-time job maximises the rate at which jobs complete, and the interrupt-and-restart escape hatch means a suboptimal commitment can be retracted once a shorter job arrives.
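A minimal event-loop sketch of the earliest-finish greedy under interrupt-and-restart semantics. Hedge: the source only names the policy; the simulator, the `horizon` cutoff standing in for deadlines, and the data layout are assumptions.

```python
import heapq

def earliest_finish_greedy(jobs, horizon):
    """jobs: list of (arrival, length) on a unit-capacity machine.

    Counts jobs completed by `horizon`. Interrupt-and-restart semantics:
    an interrupted job loses its partial work but returns to the queue
    with its full length and may be retried later.
    """
    pending = []                     # min-heap of job lengths (shortest first)
    completed, t = 0, 0
    running = None                   # (start_time, length) of the active job
    for arrival, length in sorted(jobs) + [(float("inf"), 0)]:
        # Advance time to this arrival, finishing queued jobs as they end.
        while True:
            if running is None:
                if not pending:
                    break
                running = (t, heapq.heappop(pending))
            start, run_len = running
            if start + run_len <= arrival:
                if start + run_len <= horizon:
                    completed += 1
                t, running = start + run_len, None
            else:
                break
        t = max(t, arrival)
        if arrival == float("inf"):  # sentinel: drain the queue and stop
            break
        # Greedy rule: run whichever job would finish earliest.
        if running is not None and running[0] + running[1] > t + length:
            heapq.heappush(pending, running[1])  # restart later, full length
            running = (t, length)
        elif running is None:
            running = (t, length)
        else:
            heapq.heappush(pending, length)
    return completed
```

On the adversarial instance from takeaway 2 (one long job, then a burst of unit jobs) this policy interrupts the long job, completes the short burst, and then retries the long job from scratch — the retraction mechanism that recovers the ½-competitive bound.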

Patterns extended

  • patterns/lifetime-aware-rescheduling — the LARS pattern from Google 2025-10-17. Sibling pattern to tentative-schedule-with-revision in this paper: both are "commit to a plan, keep planning, revise on new information" rather than "commit greedily on arrival". LARS revises via learned-lifetime distributions; the tentative-schedule algorithm revises via a fixed four-action rule on new job arrivals. Both are Google Research scheduling-theory interventions on Borg-adjacent workloads at different layers.
