New AI architecture delivers 100x faster reasoning than LLMs with just 1,000 training examples
9 months ago
- #AI
- #Machine Learning
- #Neuroscience
- Sapient Intelligence, a Singapore-based AI startup, developed the Hierarchical Reasoning Model (HRM), a new AI architecture that outperforms large language models (LLMs) in complex reasoning tasks while being smaller and more data-efficient.
- HRM is inspired by the human brain's use of distinct systems for slow, deliberate planning and fast, intuitive computation, achieving impressive results with less data and memory than LLMs.
- Current LLMs rely on chain-of-thought (CoT) prompting, which breaks problems into explicit text steps; this approach is brittle, data-hungry, and produces long, slow responses.
- HRM instead uses 'latent reasoning': it reasons internally in its hidden state without generating explicit language, closer to human thought processes, while sidestepping training problems such as vanishing gradients and early convergence.
- HRM consists of two coupled modules: a high-level (H) module for abstract planning and a low-level (L) module for fast computations, enabling deep reasoning without extensive data.
- HRM outperformed state-of-the-art CoT models on benchmarks like ARC-AGI, Sudoku-Extreme, and Maze-Hard, reaching near-perfect accuracy with only about 1,000 training examples.
- HRM's efficiency offers significant cost savings and speedups, making it suitable for latency-sensitive fields like robotics and data-scarce domains like scientific exploration.
- Sapient Intelligence is evolving HRM into a general-purpose reasoning module, with promising applications in healthcare, climate forecasting, and robotics.
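The two-timescale design described above (a slow high-level planner coupled to a fast low-level computation module) can be sketched roughly as follows. This is a minimal illustrative toy, not the paper's implementation: the function names, hidden sizes, update rules, and random fixed weights are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # hidden size for both modules (illustrative choice)

# Random fixed weights stand in for trained parameters.
W_l = rng.standard_normal((D, 3 * D)) * 0.1  # L update: [z_L, z_H, x] -> z_L
W_h = rng.standard_normal((D, 2 * D)) * 0.1  # H update: [z_H, z_L] -> z_H

def hrm_forward(x, n_cycles=4, t_steps=8):
    """Run n_cycles slow high-level updates; inside each cycle the
    low-level state takes t_steps fast iterations conditioned on the
    current high-level state and the input."""
    z_h = np.zeros(D)  # slow, abstract planning state
    z_l = np.zeros(D)  # fast computation state
    for _ in range(n_cycles):
        for _ in range(t_steps):  # fast timescale
            z_l = np.tanh(W_l @ np.concatenate([z_l, z_h, x]))
        z_h = np.tanh(W_h @ np.concatenate([z_h, z_l]))  # slow timescale
    return z_h

out = hrm_forward(rng.standard_normal(D))
print(out.shape)  # (16,)
```

The nesting is the point: the low-level loop does many cheap steps per high-level update, so "deep" iterative reasoning happens in hidden state rather than in generated text tokens.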