Hope: A post-transformer architecture for general intelligence at low compute
- #post-transformer architecture
- #general intelligence
- #low compute AI
- Hope is a research initiative exploring post-transformer architectures for general intelligence at low compute, arguing that transformers compute the wrong probability operation for intelligence.
- An initial validation tested seven pre-registered rungs; four were cleared with models ranging from 0.69M to 3M parameters, achieving 9.2% exact-match on novel ARC tasks, double the closest published baseline.
- Key architectural components include a structured discrete latent space, a program decoder, and verifier-driven search at inference time, designed to prevent posterior collapse and enable cross-task generalization (a sketch of these components follows this list).
- Search over latent codes outperformed amortized inference on 95.5% of held-out instances and, in one case, lifted performance from 0% to 100% exact-match without additional training (see the second sketch below).
- Self-improvement experiments (Rung 5) produced only transient rather than monotonic gains, a difficulty at small scale that is consistent with the existing literature (the loop is sketched below).
- Scaling to 3M parameters (Rung 6) yielded only modest gains, pointing to data limitations, while Rung 7, targeting 100M+ parameters and multi-domain benchmarks, awaits further resources.
- The initiative emphasizes pre-registration and transparent reporting of both successes and failures, with proprietary implementation details withheld for future phases.
- Future work seeks funding to scale the architecture, validate verifier-closed self-improvement, and compete with frontier models on broader benchmarks.
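Since the post withholds implementation details, the following is a minimal sketch of the three named components under loud assumptions: a VQ-style structured latent with independent per-slot codebooks and a GRU program decoder. Every name, dimension, and design choice here is hypothetical, not the initiative's actual architecture.

```python
import torch
import torch.nn as nn

class DiscreteLatentProgramModel(nn.Module):
    """Hypothetical stand-in for the described encoder/latent/decoder stack."""

    def __init__(self, vocab_size=64, num_slots=8, codebook_size=32, d_model=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)
        # Structured latent: num_slots independent categorical codes, each with
        # its own codebook, rather than one flat latent vector. Factorizing the
        # latent this way is one standard guard against posterior collapse.
        self.to_logits = nn.Linear(d_model, num_slots * codebook_size)
        self.codebooks = nn.Parameter(torch.randn(num_slots, codebook_size, d_model))
        self.decoder = nn.GRU(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)
        self.num_slots, self.codebook_size = num_slots, codebook_size

    def encode(self, task_tokens):
        # task_tokens: (batch, seq) integer ids describing a task instance.
        h, _ = self.encoder(self.embed(task_tokens))
        logits = self.to_logits(h[:, -1])  # final hidden state as task summary
        return logits.view(-1, self.num_slots, self.codebook_size)

    def decode(self, codes):
        # codes: (batch, num_slots) indices selecting one entry per codebook.
        slot_vecs = torch.stack(
            [self.codebooks[s, codes[:, s]] for s in range(self.num_slots)], dim=1
        )
        h, _ = self.decoder(slot_vecs)
        return self.out(h)  # (batch, num_slots, vocab_size) program-token logits
```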
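The 95.5% result contrasts two inference modes: a single amortized forward pass versus search over latent codes filtered by a verifier. Below is a hedged sketch of both, assuming the model above; `verifier` is a hypothetical callable that executes a decoded program against a task's demonstration pairs and reports exact match. The post does not describe its actual search procedure.

```python
import torch

@torch.no_grad()
def amortized_codes(model, task_tokens):
    # Amortized inference: one forward pass, most likely code per slot.
    return model.encode(task_tokens).argmax(dim=-1)

@torch.no_grad()
def verifier_search(model, task_tokens, verifier, num_candidates=256):
    logits = model.encode(task_tokens)          # (1, slots, codebook)
    probs = torch.softmax(logits, dim=-1)
    best = amortized_codes(model, task_tokens)  # fallback: amortized guess
    for _ in range(num_candidates):
        # Sample a full code assignment from the encoder's posterior.
        codes = torch.distributions.Categorical(probs=probs).sample()
        program = model.decode(codes).argmax(dim=-1)
        if verifier(program):                   # keep the first code that verifies
            return codes, True
    return best, False
```

Because the verifier gives a hard accept/reject signal, search can recover solutions the amortized posterior ranks low, which is one way a model could jump from 0% to 100% exact-match on a task without any weight update.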
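Rung 5's "verifier-closed" loop presumably closes this search with training. The sketch below is one plausible reading under the same assumptions: distill verified search results back into the encoder so the amortized posterior improves between rounds. The post reports that such loops gave only transient gains at this scale; nothing here reflects its actual code.

```python
import torch
import torch.nn.functional as F

def self_improvement_round(model, tasks, verifier, optimizer):
    # Phase 1: collect solutions that the verifier accepts.
    verified = []
    for task_tokens in tasks:
        codes, ok = verifier_search(model, task_tokens, verifier)
        if ok:
            verified.append((task_tokens, codes))
    # Phase 2: distill verified search results into the amortized encoder,
    # so the next round's single-pass guesses start closer to solutions.
    model.train()
    for task_tokens, codes in verified:
        logits = model.encode(task_tokens)       # (1, slots, codebook)
        loss = F.cross_entropy(logits.squeeze(0), codes.squeeze(0))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return len(verified)  # verified count per round, a crude progress signal
```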