Netflix Simplified Batch Compute with Kueue
a day ago
- #Batch Compute
- #Netflix
- #Kubernetes
- Netflix migrated from a custom batch solution (CMB) to Kubernetes-native Kueue for batch workloads.
- CMB used a tenant hierarchy with reserved and shared capacity, but lacked features like preemption.
- Kueue was chosen for its compatibility with existing scheduling, multi-tenant support, and built-in features like preemption.
- The migration to Netflix Batch was transparent to users. It involved converting tenants to Kueue resources, and the team tackled complex use cases early.
- Kueue improved resource utilization through fair sharing and preemption, and it now manages millions of workloads in production.
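
The tenant-to-Kueue conversion mentioned above can be pictured with a minimal sketch. It assumes a tenant's reserved capacity maps to a ClusterQueue `nominalQuota`, its shared capacity to a `borrowingLimit`, and its parent in the hierarchy to a Kueue cohort. That mapping, the `Tenant` record, and the `tenant_to_kueue` helper are hypothetical illustrations and not Netflix's actual migration code; only the manifest field names come from the Kueue `v1beta1` API.

```python
from dataclasses import dataclass

KUEUE_API_VERSION = "kueue.x-k8s.io/v1beta1"


@dataclass
class Tenant:
    """Hypothetical tenant record; field names are illustrative only."""
    name: str
    parent: str        # parent tenant group, mapped here to a Kueue cohort
    namespace: str     # namespace where the tenant submits jobs
    reserved_cpu: int  # capacity guaranteed to the tenant
    shared_cpu: int    # extra capacity the tenant may borrow


def tenant_to_kueue(tenant: Tenant, flavor: str = "default-flavor") -> list[dict]:
    """Build ClusterQueue and LocalQueue manifests for one tenant.

    Assumes a ResourceFlavor named `flavor` already exists in the cluster.
    """
    cluster_queue = {
        "apiVersion": KUEUE_API_VERSION,
        "kind": "ClusterQueue",
        "metadata": {"name": tenant.name},
        "spec": {
            # Queues in the same cohort can lend unused quota to each other.
            "cohort": tenant.parent,
            "resourceGroups": [{
                "coveredResources": ["cpu"],
                "flavors": [{
                    "name": flavor,
                    "resources": [{
                        "name": "cpu",
                        "nominalQuota": str(tenant.reserved_cpu),
                        "borrowingLimit": str(tenant.shared_cpu),
                    }],
                }],
            }],
            # Preemption settings: take back lent quota when needed, and
            # evict lower-priority workloads within the same queue.
            "preemption": {
                "reclaimWithinCohort": "Any",
                "withinClusterQueue": "LowerPriority",
            },
        },
    }
    local_queue = {
        "apiVersion": KUEUE_API_VERSION,
        "kind": "LocalQueue",
        "metadata": {"name": tenant.name, "namespace": tenant.namespace},
        "spec": {"clusterQueue": tenant.name},
    }
    return [cluster_queue, local_queue]


if __name__ == "__main__":
    import json

    example = Tenant("encoding", "media", "encoding-ns", reserved_cpu=100, shared_cpu=50)
    print(json.dumps(tenant_to_kueue(example), indent=2))
```

Generating manifests from existing tenant records, rather than writing them by hand, is one way such a migration could stay transparent to users: the tenant definitions remain the source of truth while Kueue objects are derived from them.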