
Runc breaks pods when CPU requests aren't multiples of 10

14 days ago
  • #containerd
  • #cgroup
  • #kubernetes
  • Pod creation fails intermittently for a CPU limit of 4096m because containerd computes the CPU quota non-deterministically, as either 409600 or 410000 microseconds.
  • runc always computes 410000 microseconds, so when containerd picks 409600 the two values disagree and the kernel rejects the setting.
  • The issue looks node-specific: a node where containerd picked 409600 gets stuck, and all subsequent pod creations on it fail.
  • Investigation shows that containerd's conversion from millicores to microseconds is non-deterministic, whereas runc rounds the same way every time.
  • Critical impact: non-deterministic pod creation failures, broken nodes that require manual intervention, and production issues on Amazon EKS clusters.
  • Root cause: containerd and runc calculate the CPU quota inconsistently; both need to arrive at the same value deterministically.
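
The two numbers in the summary can be reproduced with a little arithmetic. The sketch below is illustrative only and is not code from containerd or runc. It assumes the default CFS period of 100000 microseconds and assumes that the 410000 figure comes from rounding the CPU time per second up to the next multiple of 10 ms, which would also explain why only millicore values that are not multiples of 10 are affected. The function names are hypothetical.

```python
CFS_PERIOD_US = 100_000  # assumed default CFS period (100 ms)


def millicores_to_quota_us(millicores: int, period_us: int = CFS_PERIOD_US) -> int:
    """Exact conversion: quota = millicores * period / 1000."""
    return millicores * period_us // 1000


def round_quota_to_10ms_per_sec(quota_us: int, period_us: int = CFS_PERIOD_US) -> int:
    """Round the quota up so CPU time per second is a multiple of 10 ms.

    Assumption: this mirrors the kind of rounding that yields 410000 in the
    summary; it is not taken from the runc source.
    """
    per_sec_us = quota_us * 1_000_000 // period_us
    if per_sec_us % 10_000 != 0:
        per_sec_us = (per_sec_us // 10_000 + 1) * 10_000
    return per_sec_us * period_us // 1_000_000


def is_affected(millicores: int) -> bool:
    """True when the exact and the rounded quota differ for this limit."""
    exact = millicores_to_quota_us(millicores)
    return exact != round_quota_to_10ms_per_sec(exact)


if __name__ == "__main__":
    for m in (4000, 4090, 4096):
        exact = millicores_to_quota_us(m)
        rounded = round_quota_to_10ms_per_sec(exact)
        print(f"{m}m: exact={exact} rounded={rounded} affected={is_affected(m)}")
```

Under these assumptions a 4096m limit gives an exact quota of 409600 and a rounded quota of 410000, while limits such as 4000m or 4090m give the same value either way. A mismatch can only arise when one component uses the exact value and the other uses the rounded one.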