Hasty Briefsbeta

Bilingual

LLMs Are Complicated Now

9 hours ago
  • #LLMs
  • #model architecture
  • #machine learning
  • Meta had two distinct machine learning branches: LLMs like Llama and complex recommendation systems, with the latter now simplified.
  • Modern LLMs feature diverse attention mechanisms, MoE, integrated vision/audio encoders, and multi-GPU inference, increasing complexity.
  • Recommendation systems evolved from simple two-tower networks to complex models balancing capability and efficiency.
  • Agents may not solve optimization issues; a reliable baseline is needed for verifying generated kernels.
  • Performance improvements are critical; small gaps between optimization and necessity require partially optimized variants for research.
  • FlexAttention in PyTorch exemplifies composable design, enabling exploration with minimal performance loss.
  • Andrej Karpathy emphasizes composability and architectural simplification as key to advancing AI research loops.