How Netflix Simplified Batch Compute with Kueue
Summary
Netflix replaced the custom queuing and scheduling logic in their homegrown managed batch solution, Compute Managed Batch (CMB), with Kueue — a cloud-native Kubernetes job queueing system. CMB had been built in 2018 before mature open-source batch offerings existed. As the Kubernetes ecosystem matured, features CMB provided (or aspired to) — fair sharing, hierarchical tenants, capacity management, priority queuing, preemption — became available in open-source projects. The team chose Kueue over YuniKorn and Volcano because Kueue does not replace the kube-scheduler, integrates with existing Titus scheduling profiles, and supports multi-tenant quota management over heterogeneous hardware. The migration (dubbed "Netflix Batch") handled millions of production batch workloads with zero user-facing changes, completing the production rollout in only 4 weeks.
Key Takeaways
- CMB existed since 2018: Netflix built a custom managed batch solution before Kubernetes-native alternatives matured. It handled workload submission, priority queuing, and capacity management atop Titus (Source: raw file, "Brief Overview of CMB and Titus" section).
- Tenant hierarchy is the core abstraction: CMB uses internal tenants (tree organizers, no queues) and leaf tenants (accept work, have queues). Capacity is configured per-tenant with weight-based fair sharing across the tree (Source: raw file, "CMB Tenant Hierarchy" section).
- Two capacity types, reserved and shared: Reserved capacity partitions resources exclusively; shared capacity is a global pool that any tenant can burst into. Under CMB, fair sharing only applied at admission (no preemption post-admission) (Source: raw file, "Reserved Capacity" / "Shared Capacity" sections).
- Kueue chosen over YuniKorn and Volcano for key architectural reasons: (a) doesn't replace pod scheduling by kube-scheduler, preserving Titus scheduling profiles; (b) supports multi-tenant quota over heterogeneous hardware; (c) operates on native primitives (`v1.Pod`, `batch/v1.Job`) and higher-level abstractions (`RayJob`); (d) native preemption and all-or-nothing scheduling (Source: raw file, "Why Kueue?" section).
- Transparent migration with zero user lift: The migration maintained API parity with CMB's existing interface. Under the hood, internal tenants map to Kueue Cohorts and leaf tenants to ClusterQueue + LocalQueue. Capacity configuration converts to resource flavors and nominal quotas (Source: raw file, "Migrating to Kueue" section).
- Migrate the hardest customer first: Netflix deliberately enrolled their largest and most complex customer first, building confidence early and reducing the production migration to only 4 weeks (Source: raw file, "Lessons Learned" point 2).
- QPS/burst tuning required: Kueue's default QPS, Burst, and `groupKindConcurrency` settings were insufficient for Netflix's throughput. This was derisked via load tests in a dev environment mimicking Titus (Source: raw file, "Lessons Learned" point 3).
- Preemption-based fair sharing unlocks better utilization: With Kueue, reserved resources can be lent to other tenants when idle (`reclaimWithinCohort: Any`) and reclaimed via preemption. Lower-priority workloads get preempted for higher-priority ones (`withinClusterQueue: LowerPriority`). This produced a significant increase in average resource utilization (Source: raw file, "Fair Sharing and Preemption" section).
- Titus federation abstracts cell topology: CMB (and now Netflix Batch) talks to a single Titus endpoint for workload submission and capacity reservation; federation routes to the correct underlying Kubernetes cluster. The new flow uses a custom "Kueue router" in Titus federation (Source: raw file, "Brief Overview of CMB and Titus" + "Netflix Batch User/Application Workload Submission Flow").
- Future work, broader enrollment and training infra: Netflix plans to enroll more Titus batch workloads into the managed experience, and internal training teams are using learnings for Kubernetes-native training job scheduling (Source: raw file, "Current State of Kueue at Netflix").
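The weight-based fair sharing mentioned in the tenant hierarchy takeaway can be illustrated with a small sketch. The post does not spell out CMB's algorithm, so this assumes the simplest reading: each tenant receives a slice of its parent's capacity proportional to its weight among its siblings. The `Tenant` class, the tenant names, the weights, and the capacity figure are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Tenant:
    """Node in a hypothetical tenant tree: leaves accept work, internal nodes only organize."""
    name: str
    weight: float = 1.0
    children: list["Tenant"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def fair_shares(node: Tenant, capacity: float) -> dict[str, float]:
    """Split capacity down the tree in proportion to sibling weights; return per-leaf shares."""
    if node.is_leaf:
        return {node.name: capacity}
    total_weight = sum(child.weight for child in node.children)
    shares: dict[str, float] = {}
    for child in node.children:
        # Assumption: weights are positive, so total_weight is never zero.
        shares.update(fair_shares(child, capacity * child.weight / total_weight))
    return shares


# Hypothetical tree: two internal tenants ("root", "encoding") and three leaf tenants.
tree = Tenant("root", children=[
    Tenant("encoding", weight=3, children=[
        Tenant("encoding-prod", weight=2),
        Tenant("encoding-dev", weight=1),
    ]),
    Tenant("research", weight=1),
])

shares = fair_shares(tree, capacity=1200.0)
```

With these made-up numbers, `encoding` receives three quarters of the capacity, which its two leaves then split 2:1, while `research` receives the remaining quarter.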
Architectural Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Replace CMB internals vs. new API | Replace internals, keep API | Derisks by unstacking bets; doesn't disrupt customers |
| Kueue vs. YuniKorn vs. Volcano | Kueue | Doesn't replace kube-scheduler; multi-tenant heterogeneous quota; native preemption |
| Migration order | Largest/most complex customer first | Builds confidence early; compresses overall timeline |
| Capacity semantics | Cohorts (internal) + ClusterQueue/LocalQueue (leaf) | Maps 1:1 to existing CMB tenant hierarchy |
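The tenant-to-Kueue mapping in the last row can be sketched as manifest generation. This is an illustration, not Netflix's code: the helper functions, tenant names, namespace names, flavor name, and quota values are hypothetical. The field names follow the upstream Kueue ClusterQueue, LocalQueue, and Cohort objects as far as I know them, but the `apiVersion` strings and the Cohort `parent` field are assumptions that depend on the Kueue release, so check them against the installed CRDs.

```python
import json

KUEUE_API = "kueue.x-k8s.io/v1beta1"
COHORT_API = "kueue.x-k8s.io/v1alpha1"  # assumption: Cohort's version varies by release


def cohort(name, parent=None):
    """Internal tenant -> Cohort (organizes the tree, holds no queue)."""
    spec = {"parent": parent} if parent else {}
    return {"apiVersion": COHORT_API, "kind": "Cohort",
            "metadata": {"name": name}, "spec": spec}


def cluster_queue(name, cohort_name, flavor, cpu_quota):
    """Leaf tenant -> ClusterQueue holding the nominal quota and preemption policy."""
    return {
        "apiVersion": KUEUE_API, "kind": "ClusterQueue",
        "metadata": {"name": name},
        "spec": {
            "cohort": cohort_name,
            "namespaceSelector": {},  # match all namespaces
            "resourceGroups": [{
                "coveredResources": ["cpu"],
                "flavors": [{"name": flavor,
                             "resources": [{"name": "cpu", "nominalQuota": cpu_quota}]}],
            }],
            # Preemption values as quoted in the post.
            "preemption": {"reclaimWithinCohort": "Any",
                           "withinClusterQueue": "LowerPriority"},
        },
    }


def local_queue(name, namespace, cluster_queue_name):
    """Leaf tenant -> namespaced LocalQueue pointing at its ClusterQueue."""
    return {"apiVersion": KUEUE_API, "kind": "LocalQueue",
            "metadata": {"name": name, "namespace": namespace},
            "spec": {"clusterQueue": cluster_queue_name}}


def tenant_manifests(tenant, parent=None):
    """Walk a tenant tree (nested dicts) and emit Kueue objects in creation order."""
    if tenant.get("children"):
        objs = [cohort(tenant["name"], parent)]
        for child in tenant["children"]:
            objs.extend(tenant_manifests(child, tenant["name"]))
        return objs
    name = tenant["name"]
    return [cluster_queue(name, parent, tenant["flavor"], tenant["cpu"]),
            local_queue(name, tenant["namespace"], name)]


# Hypothetical tree: one internal tenant with two leaf tenants.
tree = {"name": "media", "children": [
    {"name": "media-prod", "namespace": "media-prod", "flavor": "default-cpu", "cpu": "400"},
    {"name": "media-dev", "namespace": "media-dev", "flavor": "default-cpu", "cpu": "100"},
]}

manifests = tenant_manifests(tree)
rendered = json.dumps(manifests, indent=2)  # JSON is also valid YAML input
```

Generating the objects from the tenant tree, rather than maintaining them by hand, is one way to keep the existing tenant configuration as the source of truth while Kueue does the queueing.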
Operational Numbers
- Millions of batch workloads managed by Kueue in production
- Production migration completed in 4 weeks
- Significant increase in average resource utilization after preemption-based fair sharing deployed
Caveats
- The post does not disclose exact cluster sizes, QPS numbers, or preemption latency SLAs.
- Fair-sharing semantics changed from admission-only (CMB) to continuous with preemption (Kueue) — a semantic shift users should understand.
- The Kueue router in Titus federation is custom Netflix code, not upstream Kueue.
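The shift from admission-only fair sharing to continuous fair sharing with preemption can be made concrete with a toy model. This is not Kueue's algorithm. It is a minimal sketch assuming two tenants in one cohort, unit-sized workloads, no borrowing limit, and a reclaim rule in the spirit of `reclaimWithinCohort: Any`: idle quota can be borrowed, and its owner can preempt borrowers to get back up to its nominal quota. The `Queue` class, the function names, and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    name: str
    nominal: int      # reserved quota, counted in unit-sized workload slots
    running: int = 0  # workloads currently admitted

    @property
    def borrowed(self) -> int:
        return max(0, self.running - self.nominal)


def admit(queue, cohort, allow_reclaim):
    """Try to admit one workload; return (admitted, name of the queue preempted or None)."""
    total = sum(q.nominal for q in cohort)
    used = sum(q.running for q in cohort)
    if used < total:
        # Free capacity exists: run within own quota or borrow idle quota.
        queue.running += 1
        return True, None
    if allow_reclaim and queue.running < queue.nominal:
        # Cohort is full, but this queue is below its reservation: reclaim from a borrower.
        borrower = next((q for q in cohort if q is not queue and q.borrowed > 0), None)
        if borrower is not None:
            borrower.running -= 1
            queue.running += 1
            return True, borrower.name
    return False, None  # the workload stays queued


def scenario(allow_reclaim):
    a, b = Queue("a", nominal=2), Queue("b", nominal=2)
    cohort = [a, b]
    for _ in range(4):  # b fills the whole cohort while a is idle
        admit(b, cohort, allow_reclaim)
    outcome = admit(a, cohort, allow_reclaim)  # a then submits one workload
    return outcome, a.running, b.running


admission_only = scenario(allow_reclaim=False)
with_preemption = scenario(allow_reclaim=True)
```

In the admission-only run, tenant `a` has to wait until one of `b`'s workloads finishes, even though `b` is running on `a`'s reservation. In the preemption run, `a` is admitted immediately and `b` loses one borrowed slot. That difference is what a user who previously relied on admitted work never being interrupted would notice.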
Source
- Original: https://netflixtechblog.com/how-netflix-simplified-batch-compute-with-kueue-87860682629c?source=rss----2615bd06b42e---4
- Raw markdown: `raw/netflix/2026-06-22-how-netflix-simplified-batch-compute-with-kueue-c45923a3.md`
Related
- systems/netflix-titus · systems/kueue · systems/netflix-batch
- concepts/fair-sharing · concepts/preemption · concepts/hierarchical-tenant-model
- concepts/workload-federation · concepts/multi-tenant-capacity-management
- concepts/reserved-vs-shared-capacity
- patterns/api-parity-migration · patterns/migrate-largest-customer-first
- patterns/transparent-migration · patterns/cohort-based-quota-hierarchy
- companies/netflix