Hasty Briefsbeta

Diagnosing a Linux Performance Regression

12 hours ago
  • #Linux
  • #Performance
  • #Kubernetes
  • Automattic uses Kubernetes for WordPress VIP applications with firewall rules for security.
  • A performance regression in the Linux kernel ipset module caused operations to run up to 1,000 times slower.
  • Monitoring scripts that usually took 2 seconds started taking over a minute due to slow iptables-save operations.
  • The issue was traced to getsockopt calls for ipset information taking significantly longer.
  • Kubernetes and kube-router use ipset for NetworkPolicy enforcement, managing over 6,000 sets per node.
  • ipset swap operations, crucial for kube-router's firewall updates, were found to be the bottleneck.
  • A kernel update introduced a regression where ip_set_swap operations slowed from microseconds to milliseconds.
  • The regression was fixed after reporting to the Linux Kernel Mailing List, with patches quickly released.