Hasty Briefs

Building the largest known Kubernetes cluster, with 130k nodes

  • #AI Workloads
  • #Scalability
  • #Kubernetes
  • Google Kubernetes Engine (GKE) successfully ran a 130,000-node cluster in experimental mode, doubling its officially supported limit of 65,000 nodes.
  • Scaling is not only about node count: Pod creation, scheduling throughput, and distributed storage all have to keep pace, sustaining roughly 1,000 Pod creations per second (a load-generator sketch follows this list).
  • AI workloads are driving demand for mega-clusters, with power constraints shifting focus to multi-cluster solutions like MultiKueue.
  • Key innovations include optimized read scalability with Consistent Reads from Cache and Snapshottable API Server Cache.
  • A proprietary key-value store built on Google’s Spanner database, used in place of etcd, supports this scale, absorbing about 13,000 QPS of lease updates from node heartbeats (a lease-renewal sketch follows this list).
  • Kueue layers job-level queueing on top of Kubernetes, providing workload prioritization and 'all-or-nothing' (gang) admission for AI/ML environments (a minimal submission sketch follows this list).
  • Future scheduling enhancements move from Pod-centric to workload-aware scheduling, treating an entire workload, rather than individual Pods, as the unit of placement.
  • GCS FUSE and Google Cloud Managed Lustre offer scalable, high-throughput data access for AI workloads.
  • A four-phase benchmark validated GKE’s performance, showing efficient preemption, scheduling, and elasticity under extreme loads.
  • GKE demonstrated stability with low latency, high throughput (1,000 Pods/s), and over 1 million objects in the database.
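As a rough illustration of what sustaining a given Pod-creation rate involves, here is a minimal Go load-generator sketch using client-go: a pool of workers creates pause Pods and the program reports the achieved creation rate. This is not the benchmark harness from the post; the namespace, pod names, worker count, and pod count are illustrative assumptions, and a kubeconfig is assumed at the default location.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumes a reachable cluster and a kubeconfig at the default location.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	const total = 200  // keep the toy run small; the post's benchmark sustained ~1,000 Pods/s
	const workers = 20 // parallel clients issuing Pod creates

	start := time.Now()
	jobs := make(chan int)
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				pod := &corev1.Pod{
					ObjectMeta: metav1.ObjectMeta{Name: fmt.Sprintf("load-pod-%d", i)},
					Spec: corev1.PodSpec{
						RestartPolicy: corev1.RestartPolicyNever,
						Containers: []corev1.Container{{
							Name:  "pause",
							Image: "registry.k8s.io/pause:3.9",
						}},
					},
				}
				if _, err := clientset.CoreV1().Pods("default").Create(context.Background(), pod, metav1.CreateOptions{}); err != nil {
					fmt.Println("create failed:", err)
				}
			}
		}()
	}

	for i := 0; i < total; i++ {
		jobs <- i
	}
	close(jobs)
	wg.Wait()

	elapsed := time.Since(start).Seconds()
	fmt.Printf("created %d pods in %.1fs (%.0f pods/s)\n", total, elapsed, float64(total)/elapsed)
}
```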
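The 13,000 QPS of lease updates is consistent with node heartbeats alone: 130,000 kubelets each renewing their Lease in the kube-node-lease namespace roughly every 10 seconds works out to about 13,000 writes per second. The sketch below uses client-go's coordination API to show what a single such renewal looks like; the node name is a placeholder assumption.

```go
package main

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumes a reachable cluster and a kubeconfig at the default location.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	ctx := context.Background()
	leases := clientset.CoordinationV1().Leases("kube-node-lease")

	// Fetch the Lease a kubelet heartbeats through and bump its renew time;
	// each such update is one of the small, frequent writes behind the 13,000 QPS figure.
	// "example-node" is a placeholder for a real node name.
	lease, err := leases.Get(ctx, "example-node", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	now := metav1.NewMicroTime(time.Now())
	lease.Spec.RenewTime = &now
	if _, err := leases.Update(ctx, lease, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}
}
```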
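To make the Kueue flow concrete, here is a minimal Go sketch that submits a suspended batch Job labeled with a Kueue queue name; Kueue admits (unsuspends) the Job only once quota for the entire Job is available, which is the 'all-or-nothing' behavior mentioned above. The job name, queue name, image, and parallelism are illustrative assumptions, and the cluster is assumed to already have Kueue and a matching LocalQueue installed.

```go
package main

import (
	"context"

	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/utils/ptr"
)

func main() {
	// Assumes a reachable cluster with Kueue installed and a LocalQueue named "team-a-queue".
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// A Job created in the suspended state and labeled with a Kueue queue name.
	// Kueue unsuspends it only when quota for all of its Pods fits at once.
	job := &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "training-job", // illustrative name
			Namespace: "default",
			Labels:    map[string]string{"kueue.x-k8s.io/queue-name": "team-a-queue"},
		},
		Spec: batchv1.JobSpec{
			Parallelism: ptr.To[int32](4),
			Completions: ptr.To[int32](4),
			Suspend:     ptr.To(true),
			Template: corev1.PodTemplateSpec{
				Spec: corev1.PodSpec{
					RestartPolicy: corev1.RestartPolicyNever,
					Containers: []corev1.Container{{
						Name:    "trainer",
						Image:   "busybox",
						Command: []string{"sh", "-c", "echo training step && sleep 30"},
					}},
				},
			},
		},
	}

	if _, err := clientset.BatchV1().Jobs("default").Create(context.Background(), job, metav1.CreateOptions{}); err != nil {
		panic(err)
	}
}
```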