Hope: A post-transformer architecture for general intelligence at low compute
- #post-transformer architecture
- #general intelligence
- #low compute AI
- Hope is a research initiative exploring post-transformer architectures for general intelligence at low compute, arguing that transformers compute the wrong probability operation for intelligence.
- An initial validation tested seven pre-registered rungs; four were cleared with models ranging from 0.69M to 3M parameters, achieving 9.2% exact-match on novel ARC tasks, double the closest published baseline.
- Key architectural components include a structured discrete latent space, a program decoder, and verifier-driven search at inference time, designed to prevent posterior collapse and enable cross-task generalization (a sketch of these components follows this list).
- Search over latent codes outperformed amortized inference on 95.5% of held-out instances and, in one case, lifted performance from 0% to 100% exact-match without additional training (see the second sketch below).
- Self-improvement experiments (Rung 5) produced only transient rather than monotonic gains, a difficulty at small scale that is consistent with the existing literature (the loop is sketched below).
- Scaling to 3M parameters (Rung 6) yielded only modest gains, pointing to data limitations, while Rung 7, targeting 100M+ parameters and multi-domain benchmarks, awaits further resources.
- The initiative emphasizes pre-registration and transparent reporting of both successes and failures, with proprietary implementation details withheld for future phases.
- Future work seeks funding to scale the architecture, validate verifier-closed self-improvement, and compete with frontier models on broader benchmarks.
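Since the post withholds implementation details, the following is a minimal sketch of the three named components under loud assumptions: a VQ-style structured latent with independent per-slot codebooks and a GRU program decoder. Every name, dimension, and design choice here is hypothetical, not the initiative's actual architecture.

```python
import torch
import torch.nn as nn

class DiscreteLatentProgramModel(nn.Module):
    """Hypothetical stand-in for the described encoder/latent/decoder stack."""

    def __init__(self, vocab_size=64, num_slots=8, codebook_size=32, d_model=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)
        # Structured latent: num_slots independent categorical codes, each with
        # its own codebook, rather than one flat latent vector. Factorizing the
        # latent this way is one standard guard against posterior collapse.
        self.to_logits = nn.Linear(d_model, num_slots * codebook_size)
        self.codebooks = nn.Parameter(torch.randn(num_slots, codebook_size, d_model))
        self.decoder = nn.GRU(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)
        self.num_slots, self.codebook_size = num_slots, codebook_size

    def encode(self, task_tokens):
        # task_tokens: (batch, seq) integer ids describing a task instance.
        h, _ = self.encoder(self.embed(task_tokens))
        logits = self.to_logits(h[:, -1])  # final hidden state as task summary
        return logits.view(-1, self.num_slots, self.codebook_size)

    def decode(self, codes):
        # codes: (batch, num_slots) indices selecting one entry per codebook.
        slot_vecs = torch.stack(
            [self.codebooks[s, codes[:, s]] for s in range(self.num_slots)], dim=1
        )
        h, _ = self.decoder(slot_vecs)
        return self.out(h)  # (batch, num_slots, vocab_size) program-token logits
```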
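The 95.5% result contrasts two inference modes: a single amortized forward pass versus search over latent codes filtered by a verifier. Below is a hedged sketch of both, assuming the model above; `verifier` is a hypothetical callable that executes a decoded program against a task's demonstration pairs and reports exact match. The post does not describe its actual search procedure.

```python
import torch

@torch.no_grad()
def amortized_codes(model, task_tokens):
    # Amortized inference: one forward pass, most likely code per slot.
    return model.encode(task_tokens).argmax(dim=-1)

@torch.no_grad()
def verifier_search(model, task_tokens, verifier, num_candidates=256):
    logits = model.encode(task_tokens)          # (1, slots, codebook)
    probs = torch.softmax(logits, dim=-1)
    best = amortized_codes(model, task_tokens)  # fallback: amortized guess
    for _ in range(num_candidates):
        # Sample a full code assignment from the encoder's posterior.
        codes = torch.distributions.Categorical(probs=probs).sample()
        program = model.decode(codes).argmax(dim=-1)
        if verifier(program):                   # keep the first code that verifies
            return codes, True
    return best, False
```

Because the verifier gives a hard accept/reject signal, search can recover solutions the amortized posterior ranks low, which is one way a model could jump from 0% to 100% exact-match on a task without any weight update.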
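Rung 5's "verifier-closed" loop presumably closes this search with training. The sketch below is one plausible reading under the same assumptions: distill verified search results back into the encoder so the amortized posterior improves between rounds. The post reports that such loops gave only transient gains at this scale; nothing here reflects its actual code.

```python
import torch
import torch.nn.functional as F

def self_improvement_round(model, tasks, verifier, optimizer):
    # Phase 1: collect solutions that the verifier accepts.
    verified = []
    for task_tokens in tasks:
        codes, ok = verifier_search(model, task_tokens, verifier)
        if ok:
            verified.append((task_tokens, codes))
    # Phase 2: distill verified search results into the amortized encoder,
    # so the next round's single-pass guesses start closer to solutions.
    model.train()
    for task_tokens, codes in verified:
        logits = model.encode(task_tokens)       # (1, slots, codebook)
        loss = F.cross_entropy(logits.squeeze(0), codes.squeeze(0))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return len(verified)  # verified count per round, a crude progress signal
```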