
CONCEPT Cited by 1 source

Preemption

Preemption is the mechanism by which a scheduler forcibly stops lower-priority or over-quota workloads to free resources for higher-priority or entitled workloads.

Definition

In batch compute systems, preemption enables dynamic rebalancing of resources based on priority or quota entitlement. Without preemption, admission-time decisions are final: once a job starts, it holds its resources until completion, regardless of how demand changes. With preemption, the scheduler can reclaim resources from workloads that are already running.
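
To make the contrast concrete, here is a minimal toy model of a single pool of CPUs. It is only a sketch: the `Job` type, the `admit` function, and the rule of evicting the lowest-priority jobs first are assumptions made for illustration, not the behavior of any particular scheduler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    cpus: int
    priority: int  # higher number = more important (assumed convention)


def admit(running, incoming, capacity, preempt):
    """Try to start `incoming`. Returns (running_after, evicted)."""
    free = capacity - sum(j.cpus for j in running)
    if incoming.cpus <= free:
        return running + [incoming], []
    if not preempt:
        # Admission-time decisions are final: the incoming job has to wait.
        return running, []

    # Only strictly lower-priority jobs may be evicted, lowest priority first.
    candidates = sorted(
        (j for j in running if j.priority < incoming.priority),
        key=lambda j: j.priority,
    )
    evicted = []
    for victim in candidates:
        if incoming.cpus <= free:
            break
        evicted.append(victim)
        free += victim.cpus

    if incoming.cpus > free:
        # Evicting every candidate still would not help, so evict nobody.
        return running, []
    kept = [j for j in running if j not in evicted]
    return kept + [incoming], evicted


running = [Job("batch-a", 4, 1), Job("batch-b", 4, 2)]
urgent = Job("urgent", 4, 10)

no_preempt_running, no_preempt_evicted = admit(running, urgent, 8, preempt=False)
preempt_running, preempt_evicted = admit(running, urgent, 8, preempt=True)
```

Without preemption the full pool stays with the two batch jobs and `urgent` waits. With preemption, `batch-a` (the lowest-priority job) is evicted and `urgent` starts alongside `batch-b`.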

Kueue preemption policies

Kueue's ClusterQueue preemption settings include two independent policies:

  1. reclaimWithinCohort: Any — reclaims resources lent to other ClusterQueues in the same Cohort when the owning queue needs them back. This preserves reservation semantics while allowing idle capacity to be borrowed.

  2. withinClusterQueue: LowerPriority — preempts lower-priority workloads within the same ClusterQueue to make room for higher-priority submissions.

apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: "team-a-cq"
spec:
  preemption:
    reclaimWithinCohort: Any
    withinClusterQueue: LowerPriority
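
The way these two settings select preemption candidates can be sketched with a small toy model. This is not Kueue's actual algorithm: the `Workload` type, the `preemption_candidates` function, and the flat per-queue CPU quotas are hypothetical, and the real scheduler also considers ordering, resource flavors, and whether the incoming workload fits within its queue's nominal quota. The sketch assumes a workload in another queue is reclaimable only while that queue is using more than its own nominal quota, and it treats any policy value other than the two shown as disabling that rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    queue: str      # name of the ClusterQueue that admitted it
    cpus: int
    priority: int   # higher number = more important


def preemption_candidates(
    incoming,
    running,
    nominal_quota,
    reclaim_within_cohort="Any",
    within_cluster_queue="LowerPriority",
):
    """Return the running workloads that `incoming` would be allowed to preempt."""
    usage = {queue: 0 for queue in nominal_quota}
    for w in running:
        usage[w.queue] += w.cpus

    candidates = []
    for w in running:
        if w.queue == incoming.queue:
            # withinClusterQueue: LowerPriority -> same queue, strictly lower priority.
            if within_cluster_queue == "LowerPriority" and w.priority < incoming.priority:
                candidates.append(w)
        elif reclaim_within_cohort == "Any" and usage[w.queue] > nominal_quota[w.queue]:
            # reclaimWithinCohort: Any -> another queue that is borrowing,
            # regardless of the priority of its workloads.
            candidates.append(w)
    return candidates


quota = {"team-a-cq": 8, "team-b-cq": 8}
running = [
    Workload("b-1", "team-b-cq", 8, 5),
    Workload("b-2", "team-b-cq", 4, 100),  # team-b is borrowing 4 CPUs
    Workload("a-1", "team-a-cq", 4, 1),
]
incoming = Workload("a-2", "team-a-cq", 4, 50)

both = [w.name for w in preemption_candidates(incoming, running, quota)]
```

In this example `both` contains `b-1`, `b-2`, and `a-1`: the team-b workloads qualify because their queue is over its nominal quota (even `b-2`, whose priority is higher than the incoming workload's), and `a-1` qualifies because it has lower priority in the same queue.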

At Netflix

Netflix's legacy CMB system had no preemption — fair sharing applied only at admission. After adopting Kueue with preemption, Netflix saw a significant increase in average resource utilization because reserved capacity is now lent when idle and reclaimed on demand.

(Source: sources/2026-06-22-netflix-how-netflix-simplified-batch-compute-with-kueue)

