How We Found 7 TiB of Memory Just Sitting Around

6 months ago
  • #Optimization
  • #Scalability
  • #Kubernetes
  • Kubernetes clusters with many namespaces incur extra memory overhead and apiserver load from list-watch operations (an initial list followed by a watch for changes).
  • DaemonSets exacerbate the issue: they run a pod on every node, so each list-watch is repeated once per node, multiplying memory usage and apiserver load.
  • Optimization work on Calico reduced its memory usage, but Vector, another DaemonSet, was found to consume significant memory by list-watching namespaces.
  • Removing the unnecessary namespace label checks in Vector cut its memory usage by 50%.
  • Because of a configuration error, the fix had been applied to only one of the two kubernetes_logs sources; applying it to both yielded significant additional memory savings.
  • The final fix resulted in a total memory reduction of 7 TiB across clusters, improving system efficiency and rollout stability.
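
The configuration error described above (the fix landing on only one of two kubernetes_logs sources) is the kind of slip a small audit script can catch. The following is a minimal sketch, not the authors' tooling. It assumes the Vector configuration has already been parsed into a Python dict, and it assumes the relevant option is named `insert_namespace_fields` and defaults to enabled when absent; check both assumptions against the documentation for your Vector version. The source names in the example config are hypothetical.

```python
from typing import Any

# Assumption: the kubernetes_logs option that controls namespace enrichment.
# Verify the name and its default against your Vector version's documentation.
NAMESPACE_OPTION = "insert_namespace_fields"


def sources_still_watching_namespaces(config: dict[str, Any]) -> list[str]:
    """Return names of kubernetes_logs sources that have not disabled namespace enrichment."""
    missing = []
    for name, source in config.get("sources", {}).items():
        # Only kubernetes_logs sources are relevant to this check.
        if source.get("type") != "kubernetes_logs":
            continue
        # An absent key is treated as enabled (assumed default).
        if source.get(NAMESPACE_OPTION, True) is not False:
            missing.append(name)
    return sorted(missing)


# Hypothetical parsed config: two kubernetes_logs sources, only one has the fix.
example_config = {
    "sources": {
        "app_logs": {"type": "kubernetes_logs", "insert_namespace_fields": False},
        "system_logs": {"type": "kubernetes_logs"},
        "host_metrics": {"type": "host_metrics"},
    }
}

print(sources_still_watching_namespaces(example_config))  # prints ['system_logs']
```

Running a check like this in CI, and failing the build when the returned list is non-empty, would flag a source that was missed before the change rolls out to every node.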