New AI architecture delivers 100x faster reasoning than LLMs with just 1,000 training examples
9 months ago
- #AI
- #Machine Learning
- #Neuroscience
- Sapient Intelligence, a Singapore-based AI startup, developed the Hierarchical Reasoning Model (HRM), a new AI architecture that outperforms large language models (LLMs) in complex reasoning tasks while being smaller and more data-efficient.
- HRM is inspired by the human brain's use of distinct systems for slow, deliberate planning and fast, intuitive computation, achieving impressive results with less data and memory than LLMs.
- Current LLMs rely on chain-of-thought (CoT) prompting, which breaks problems into explicit text steps; this approach is brittle, data-hungry, and produces long, slow responses.
- HRM instead uses 'latent reasoning': it reasons internally in its hidden state without generating explicit language, closer to human thought processes, while sidestepping training problems such as vanishing gradients and early convergence.
- HRM consists of two coupled modules: a high-level (H) module for abstract planning and a low-level (L) module for fast computations, enabling deep reasoning without extensive data.
- HRM outperformed state-of-the-art CoT models on benchmarks like ARC-AGI, Sudoku-Extreme, and Maze-Hard, reaching near-perfect accuracy with only about 1,000 training examples.
- HRM's efficiency offers significant cost savings and speedups, making it suitable for latency-sensitive fields like robotics and data-scarce domains like scientific exploration.
- Sapient Intelligence is evolving HRM into a general-purpose reasoning module, with promising applications in healthcare, climate forecasting, and robotics.
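The two-timescale design described above (a slow high-level planner coupled to a fast low-level computation module) can be sketched roughly as follows. This is a minimal illustrative toy, not the paper's implementation: the function names, hidden sizes, update rules, and random fixed weights are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # hidden size for both modules (illustrative choice)

# Random fixed weights stand in for trained parameters.
W_l = rng.standard_normal((D, 3 * D)) * 0.1  # L update: [z_L, z_H, x] -> z_L
W_h = rng.standard_normal((D, 2 * D)) * 0.1  # H update: [z_H, z_L] -> z_H

def hrm_forward(x, n_cycles=4, t_steps=8):
    """Run n_cycles slow high-level updates; inside each cycle the
    low-level state takes t_steps fast iterations conditioned on the
    current high-level state and the input."""
    z_h = np.zeros(D)  # slow, abstract planning state
    z_l = np.zeros(D)  # fast computation state
    for _ in range(n_cycles):
        for _ in range(t_steps):  # fast timescale
            z_l = np.tanh(W_l @ np.concatenate([z_l, z_h, x]))
        z_h = np.tanh(W_h @ np.concatenate([z_h, z_l]))  # slow timescale
    return z_h

out = hrm_forward(rng.standard_normal(D))
print(out.shape)  # (16,)
```

The nesting is the point: the low-level loop does many cheap steps per high-level update, so "deep" iterative reasoning happens in hidden state rather than in generated text tokens.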