CONCEPT Cited by 1 source
Preemption
Preemption is the mechanism by which a scheduler forcibly stops lower-priority or over-quota workloads to free resources for higher-priority or entitled workloads.
Definition
In batch compute systems, preemption enables dynamic rebalancing of resources based on priority or quota entitlement. Without preemption, admission-time decisions are final: once a job starts, it holds its resources until completion, regardless of changing demand. With preemption, the system can reclaim resources from running workloads.
Kueue preemption policies
Kueue's ClusterQueue preemption settings include two orthogonal policies:
- reclaimWithinCohort: Any reclaims resources lent to other ClusterQueues in the same Cohort when the owning queue needs them back. This preserves reservation semantics while allowing idle capacity to be borrowed.
- withinClusterQueue: LowerPriority preempts lower-priority workloads within the same ClusterQueue to make room for higher-priority submissions.
apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: "team-a-cq"
spec:
  preemption:
    reclaimWithinCohort: Any
    withinClusterQueue: LowerPriority
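To make the lending and reclaiming behaviour concrete, here is a minimal sketch of two ClusterQueues sharing one Cohort, plus a priority class for the within-queue policy. All names (team-ab, team-b-cq, default-flavor, batch-high) and quota numbers are hypothetical and do not come from the source. The manifests are written against the v1beta1 schema, which is an assumption made for safety: field names such as cohort may differ under the v1beta2 API used above, so check the API reference for your Kueue version before applying anything like this.

```yaml
# Sketch only: hypothetical names and quotas, v1beta1 field names.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: team-a-cq
spec:
  cohort: team-ab              # both queues join the same Cohort
  namespaceSelector: {}        # accept workloads from any namespace
  resourceGroups:
  - coveredResources: ["cpu"]
    flavors:
    - name: default-flavor     # assumes a ResourceFlavor with this name exists
      resources:
      - name: cpu
        nominalQuota: 10       # team A's reserved share
  preemption:
    reclaimWithinCohort: Any           # take back lent quota from any borrower
    withinClusterQueue: LowerPriority  # evict lower-priority work in this queue
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: team-b-cq
spec:
  cohort: team-ab
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["cpu"]
    flavors:
    - name: default-flavor
      resources:
      - name: cpu
        nominalQuota: 10
        borrowingLimit: 10     # may borrow up to 10 idle CPUs from the cohort
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: WorkloadPriorityClass
metadata:
  name: batch-high
value: 1000                    # higher value means higher priority
description: "Hypothetical class for urgent batch jobs"
```

Under these assumptions, team B can run up to 20 CPUs while team A is idle. When team A later submits work that fits within its own 10-CPU nominal quota, reclaimWithinCohort: Any allows Kueue to preempt team B's borrowing workloads to admit it. A Job would opt into the priority class with the label kueue.x-k8s.io/priority-class: batch-high, which is what withinClusterQueue: LowerPriority compares when choosing what to preempt within a queue.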
At Netflix
Netflix's legacy CMB system had no preemption: fair sharing applied only at admission. After adopting Kueue with preemption, Netflix saw a significant increase in average resource utilization because reserved capacity is now lent when idle and reclaimed on demand.
(Source: sources/2026-06-22-netflix-how-netflix-simplified-batch-compute-with-kueue)
Seen in
- sources/2026-06-22-netflix-how-netflix-simplified-batch-compute-with-kueue — preemption-based fair sharing enabling better utilization of reserved capacity.