Hasty Briefs

Our eighth-generation TPUs: two chips for the agentic era

6 hours ago
  • #AI Hardware
  • #TPU
  • #Google Cloud
  • Google introduced its eighth-generation TPUs at Google Cloud Next: TPU 8t for training and TPU 8i for inference.
  • TPU 8t scales to 9,600 chips per superpod, delivers 121 exaFLOPS of compute, and shows near-linear scaling up to a million chips.
  • TPU 8i is optimized for latency-sensitive inference, with innovations such as breaking the 'memory wall' and doubling inter-chip interconnect (ICI) bandwidth for mixture-of-experts (MoE) models.
  • Both chips were co-designed with Google DeepMind and pair with Axion Arm-based CPUs, offering up to 2x better performance per watt than the previous generation.
  • The TPUs support frameworks including JAX, PyTorch, and vLLM, and will be available through Google’s AI Hypercomputer later this year.