Our eighth generation TPUs: two chips for the agentic era
4 hours ago
- #AI Hardware
- #TPU
- #Google Cloud
- Google introduced its eighth-generation TPUs, TPU 8t for training and TPU 8i for inference, at Google Cloud Next.
- TPU 8t features massive scale with 9,600 chips per superpod, 121 ExaFlops compute, and near-linear scaling up to a million chips.
- TPU 8i is optimized for latency-sensitive inference, with innovations like breaking the 'memory wall' and doubling ICI bandwidth for MoE models.
- Both chips are co-designed with Google DeepMind and pair with Arm-based Axion CPUs, offering up to 2x better performance per watt than the previous generation.
- The TPUs support frameworks like JAX, PyTorch, and vLLM, and will be available via Google’s AI Hypercomputer later this year.
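The superpod figures above imply a rough per-chip number. A back-of-envelope sketch, assuming the 121 ExaFLOPS figure is the aggregate across all 9,600 chips (the announcement does not state precision or sparsity assumptions):

```python
# Back-of-envelope: per-chip compute implied by the quoted superpod figures.
# Assumes 121 ExaFLOPS is aggregate across the full 9,600-chip superpod.
superpod_exaflops = 121
chips_per_superpod = 9_600

per_chip_pflops = superpod_exaflops * 1e18 / chips_per_superpod / 1e15
print(f"~{per_chip_pflops:.1f} PFLOPS per chip")  # ~12.6 PFLOPS per chip
```

That works out to roughly 12.6 PFLOPS per chip, though the actual per-chip spec depends on the numeric format the headline figure is quoted in.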