All in on MatMul? Don’t Put All Your Tensors in One Basket!
7 days ago
- #Innovation
- #AI Hardware
- #Algorithmic Bias
- The "hardware lottery" describes how AI ideas take off because they fit the dominant hardware and software, not because they are inherently superior.
- Modern chips favor DNNs and matrix multiplication (MatMul), skewing research towards these methods and potentially stifling alternative approaches.
- AI-specific chips create technological inertia, making it difficult for non-MatMul-centric ideas to gain traction or fair evaluation.
- The Matthew Effect is evident in AI: hardware favors certain algorithms, which then dominate, leading to more specialized chips for them.
- Access to large clusters and capital dictates research directions, prioritizing short-term returns over wild new paradigms.
- History shows that special-purpose machines often failed because their markets were narrow and general-purpose chips benefited from Moore's Law and economies of scale.
- The dominance of MatMul-centric AI raises the question of whether the hardware lottery is still relevant or has become a planned economy.
- Rich Sutton's 'bitter lesson' suggests that algorithms scaling best with compute (like DNNs) inevitably win, reinforcing the current paradigm.
- Uniformity in hardware risks creating an innovation monoculture, potentially masking the need for alternative hardware paradigms.
- Neurobiology suggests sparse, event-driven primitives, unlike dense MatMul-based AI, hinting at potential future breakthroughs.
- Algorithmic breakthroughs (e.g., k-d trees) can match decades of hardware advancements, highlighting the importance of software innovation.
- Two strategies to address the hardware lottery: diversify hardware ecosystems or go all-in on existing successful hardware.
- A middle path involves adding generality and programmability to specialized hardware, as seen with GPGPU evolution.
- Joint hardware-algorithm co-design, using AI to discover efficient primitives, could break the chicken-and-egg problem of innovation.
- Examples like multiplier-free accelerators (Stella Nera) show alternative compute substrates can be practical and efficient.
- The true jackpot is hardware-algorithm pairs that unlock new computing eras, not just raw speed or specialization.
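
The contrast drawn above between dense MatMul and sparse, event-driven primitives can be made concrete with a small sketch. It is a toy illustration in pure Python, not a model of any real chip or of neurobiology. The function names, layer sizes, and the 2% activity level are all assumptions chosen for the example. It computes the same matrix-vector product two ways and counts multiply-accumulate operations (MACs): once densely, touching every weight, and once by reacting only to nonzero inputs ("events").

```python
import random


def dense_matvec(weights, x):
    """Dense MatMul-style product: touches every weight, even where x[j] == 0."""
    macs = 0
    out = []
    for row in weights:
        acc = 0.0
        for w, xj in zip(row, x):
            acc += w * xj
            macs += 1
        out.append(acc)
    return out, macs


def event_driven_matvec(weights, events, n_out):
    """Event-driven product: only nonzero inputs ("events") trigger any work.

    `events` is a list of (input_index, value) pairs (a hypothetical format).
    """
    macs = 0
    out = [0.0] * n_out
    for j, value in events:
        # Each event adds its scaled weight column to the output.
        for i in range(n_out):
            out[i] += weights[i][j] * value
            macs += 1
    return out, macs


def demo(n_in=1000, n_out=64, active_fraction=0.02, seed=0):
    """Compare both methods on a random layer with sparse input activity."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]

    # Assumption for illustration: only a small fraction of inputs are active.
    active = rng.sample(range(n_in), int(n_in * active_fraction))
    x = [0.0] * n_in
    for j in active:
        x[j] = rng.uniform(0.5, 1.0)
    events = [(j, x[j]) for j in active]

    dense_out, dense_macs = dense_matvec(weights, x)
    sparse_out, sparse_macs = event_driven_matvec(weights, events, n_out)
    max_err = max(abs(a - b) for a, b in zip(dense_out, sparse_out))
    return dense_macs, sparse_macs, max_err


if __name__ == "__main__":
    dense_macs, sparse_macs, max_err = demo()
    print(f"dense MACs: {dense_macs}")
    print(f"event-driven MACs: {sparse_macs}")
    print(f"max abs difference: {max_err:.2e}")
```

With the default settings the event-driven version performs 1,280 MACs against 64,000 for the dense version, and the two outputs agree up to floating-point rounding. Fewer operations do not automatically mean faster execution: on hardware built for dense MatMul, the irregular memory access of the event-driven loop can erase the savings, which is the hardware lottery in miniature.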