Hasty Briefs (beta)

All in on MatMul? Don’t Put All Your Tensors in One Basket!

7 days ago
  • #Innovation
  • #AI Hardware
  • #Algorithmic Bias
  • The hardware lottery refers to AI ideas taking off based on alignment with dominant hardware/software rather than inherent superiority.
  • Modern chips favor DNNs and matrix multiplication (MatMul), skewing research towards these methods and potentially stifling alternative approaches.
  • AI-specific chips create technological inertia, making it difficult for non-MatMul-centric ideas to gain traction or fair evaluation.
  • The Matthew Effect is evident in AI: hardware favors certain algorithms, which then dominate, leading to more specialized chips for them.
  • Access to large clusters and capital dictates research directions, prioritizing short-term returns over wild new paradigms.
  • History shows that special-purpose machines often fail because their markets are narrow and general-purpose chips benefit from Moore’s Law economies of scale.
  • The dominance of MatMul-centric AI raises the question of whether the hardware lottery is still relevant or has become a planned economy.
  • Rich Sutton's 'bitter lesson' suggests that algorithms scaling best with compute (like DNNs) inevitably win, reinforcing the current paradigm.
  • Uniformity in hardware risks creating an innovation monoculture, potentially masking the need for alternative hardware paradigms.
  • Neurobiology suggests sparse, event-driven primitives, unlike dense MatMul-based AI, hinting at potential future breakthroughs.
  • Algorithmic breakthroughs (e.g., k-d trees) can match decades of hardware advancements, highlighting the importance of software innovation.
  • Two strategies to address the hardware lottery: diversify hardware ecosystems or go all-in on existing successful hardware.
  • A middle path involves adding generality and programmability to specialized hardware, as seen with GPGPU evolution.
  • Joint hardware-algorithm co-design, using AI to discover efficient primitives, could break the chicken-and-egg problem of innovation.
  • Examples like multiplier-free accelerators (Stella Nera) show alternative compute substrates can be practical and efficient.
  • The true jackpot is hardware-algorithm pairs that unlock new computing eras, not just raw speed or specialization.
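
To make the contrast between dense MatMul and sparse, event-driven primitives more concrete, here is a minimal Python sketch. It is an illustration only and does not describe Stella Nera or any specific accelerator. The function names, the weight matrix, and the spike pattern are all hypothetical. It assumes binary input events, in which case the dense matrix-vector product reduces to adding the weight columns of the inputs that fired, with no multiplications at all.

```python
# Hypothetical sketch: dense matrix-vector product vs. event-driven accumulation.
# All names and values are made up for illustration.

def dense_matvec(weights, x):
    """Dense path: one multiply-accumulate per weight, even where x is zero."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def event_driven_matvec(weights, spike_indices):
    """Event-driven path: for each input that fired, add its weight column.

    Inputs are assumed to be binary events, so no multiplications are
    needed and the work scales with the number of events, not the input size.
    """
    out = [0] * len(weights)
    for j in spike_indices:
        for i, row in enumerate(weights):
            out[i] += row[j]
    return out


weights = [
    [1, -2, 3, 0, 5, -1, 2, 4],
    [0, 1, -1, 2, -3, 4, 1, -2],
    [2, 2, 0, -1, 1, 0, -4, 3],
]
spikes = [1, 6]  # assumption: only inputs 1 and 6 fire in this time step
x = [1 if j in spikes else 0 for j in range(8)]

dense_result = dense_matvec(weights, x)
event_result = event_driven_matvec(weights, spikes)

dense_macs = len(weights) * len(x)       # multiply-accumulates on the dense path
event_adds = len(weights) * len(spikes)  # additions on the event-driven path

print(dense_result, event_result, dense_macs, event_adds)
```

Both paths produce the same output here, but the event-driven one does a quarter of the arithmetic and skips multiplication entirely. Hardware tuned for dense MatMul gains little from that kind of sparsity, which is the mismatch the summary points to.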
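
The point about algorithmic breakthroughs can also be sketched in code. The following is a minimal, assumption-laden Python example comparing a brute-force nearest-neighbor search with a simple k-d tree on the same data. The helper names, the point count, and the random seed are hypothetical choices for illustration, and the tree is a textbook-style version rather than a tuned implementation. The quantity of interest is how many points each method has to examine.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def brute_force_nn(points, query):
    """Examine every point; returns (nearest point, points examined)."""
    return min(points, key=lambda p: sq_dist(p, query)), len(points)


def build_kd_tree(points, depth=0):
    """Build a balanced k-d tree as nested (point, axis, left, right) tuples."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build_kd_tree(points[:mid], depth + 1),
            build_kd_tree(points[mid + 1:], depth + 1))


def kd_tree_nn(root, query):
    """Nearest-neighbor search with pruning; returns (point, nodes visited)."""
    best_point, best_d2, visited = None, float("inf"), 0

    def search(node):
        nonlocal best_point, best_d2, visited
        if node is None:
            return
        point, axis, left, right = node
        visited += 1
        d2 = sq_dist(point, query)
        if d2 < best_d2:
            best_point, best_d2 = point, d2
        diff = query[axis] - point[axis]
        near, far = (left, right) if diff < 0 else (right, left)
        search(near)
        # Only cross the splitting plane if a closer point could lie beyond it.
        if diff * diff < best_d2:
            search(far)

    search(root)
    return best_point, visited


random.seed(0)  # arbitrary seed, chosen only for reproducibility
points = [(random.random(), random.random()) for _ in range(2000)]
query = (0.5, 0.5)

brute_point, brute_examined = brute_force_nn(points, query)
kd_point, kd_examined = kd_tree_nn(build_kd_tree(points), query)

print(brute_examined, kd_examined)
```

On the same hardware, the tree finds an equally close neighbor while examining only a small fraction of the points. That saving comes from the algorithm alone, which is the kind of gain the summary compares to hardware progress.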