David Patterson: Challenges and Research Directions for LLM Inference Hardware

  • #AI Inference
  • #LLM
  • #Hardware Architecture
  • Large Language Model (LLM) inference is hard because the autoregressive Decode phase of Transformer models emits one token at a time, re-reading the model weights (and KV cache) on every step (see the sketch after this list).
  • The primary bottlenecks in LLM inference are memory bandwidth/capacity and interconnect, not compute (see the roofline arithmetic below).
  • Four architecture research opportunities are highlighted: High Bandwidth Flash, Processing-Near-Memory, 3D memory-logic stacking, and low-latency interconnect.
  • The focus is on datacenter AI, but applicability to mobile devices is also reviewed.
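
To make the Decode bullet concrete, here is a minimal sketch (not code from the talk): a toy decode loop in which each step is a matrix-vector product against a single stand-in weight matrix `W`. The width `d`, the `tanh` nonlinearity, and the 8-step loop are illustrative assumptions; real models run attention and MLP blocks per layer. What it demonstrates is that each generated token re-reads all the weights, yielding roughly 2 FLOPs per 4-byte weight read.

```python
import numpy as np

d = 4096                                        # hypothetical model width (assumption)
W = np.random.randn(d, d).astype(np.float32)    # stand-in for one weight matrix

def decode_step(x):
    """One autoregressive step: a matrix-vector product, hence memory-bound."""
    return np.tanh(W @ x)                       # toy nonlinearity; real models use attention + MLP

x = np.random.randn(d).astype(np.float32)
tokens = []
for _ in range(8):                              # generate 8 "tokens", one at a time
    x = decode_step(x)                          # each step re-reads all of W from memory
    tokens.append(int(np.argmax(x)))            # toy token choice

# Arithmetic intensity of the matvec: 2*d*d FLOPs over 4*d*d bytes of weights (fp32).
flops = 2 * d * d
print(f"FLOPs per byte ~= {flops / W.nbytes:.2f}")  # ~0.5: deep in the memory-bound regime
```

Batching many requests turns the matrix-vector product into a matrix-matrix product and raises arithmetic intensity, but per-token latency targets limit how large Decode batches can grow.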
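The memory-versus-compute bullet can be checked with back-of-envelope roofline arithmetic. The accelerator numbers below are illustrative assumptions, not figures from the talk; the conclusion holds for any current datacenter accelerator, whose ridge point sits in the hundreds of FLOPs per byte.

```python
# Roofline check with assumed hardware numbers (not from the talk).
peak_flops = 1.0e15   # assumed accelerator peak: 1 PFLOP/s
hbm_bw = 3.0e12       # assumed HBM bandwidth: 3 TB/s

ridge = peak_flops / hbm_bw   # intensity needed to become compute-bound
decode_intensity = 0.5        # fp32 matvec intensity from the sketch above

print(f"ridge point: {ridge:.0f} FLOPs/byte")          # ~333
print(f"decode step: {decode_intensity} FLOPs/byte")   # ~3 orders of magnitude below
```

That gap of roughly three orders of magnitude is what motivates the memory-side directions listed above: High Bandwidth Flash, Processing-Near-Memory, and 3D memory-logic stacking.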