Hasty Briefs (beta)


Hope: A post-transformer architecture for general intelligence at low compute

16 hours ago
  • #post-transformer architecture
  • #general intelligence
  • #low compute AI
  • Hope is a research initiative exploring post-transformer architectures for general intelligence at low compute, arguing that transformers compute the wrong probability operation for intelligence.
  • An initial validation tested seven pre-registered rungs; four were cleared with models of 0.69M to 3M parameters, achieving 9.2% exact-match on novel ARC tasks, roughly double the closest published baseline.
  • Key architectural components include a structured discrete latent space, a program decoder, and verifier-driven search at inference, designed to prevent posterior collapse and enable cross-task generalization.
  • Search over latent codes outperformed amortized inference on 95.5% of held-out instances, and in one case lifted model performance from 0% to 100% exact-match without additional training.
  • Self-improvement experiments (Rung 5) yielded only transient, non-monotonic gains, highlighting challenges at small scale that are consistent with existing literature.
  • Scaling to 3M parameters (Rung 6) showed only modest gains, suggesting data limitations; Rung 7, targeting 100M+ parameters and multi-domain benchmarks, is pending resources.
  • The initiative emphasizes pre-registration and transparent reporting of both successes and failures, with proprietary implementation details withheld for future phases.
  • Future work seeks funding to scale the architecture, validate verifier-closed self-improvement, and compete with frontier models on broader benchmarks.
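The idea of verifier-driven search over a discrete latent space (and why it can beat a single amortized forward pass) can be illustrated with a toy sketch. Everything here is an illustrative assumption, since the actual implementation is withheld: the codebook, the `decode`, `verify`, `amortized_guess`, and `search` functions, and the toy integer "programs" are hypothetical stand-ins, not Hope's architecture.

```python
# Hypothetical sketch: search latent codes, decode each to a program, and
# accept only codes whose program passes an exact-match verifier. These
# names and the toy domain are assumptions, not the Hope implementation.
from itertools import product

# Toy discrete latent space: a code is a short tuple of symbols.
CODEBOOK = ["inc", "dec", "double"]

def decode(code):
    """Map a latent code to an executable 'program' (a function on ints)."""
    ops = {"inc": lambda x: x + 1,
           "dec": lambda x: x - 1,
           "double": lambda x: x * 2}
    def program(x):
        for sym in code:
            x = ops[sym](x)
        return x
    return program

def verify(program, examples):
    """Exact-match verifier: does the program reproduce every I/O pair?"""
    return all(program(x) == y for x, y in examples)

def amortized_guess(examples):
    """Stand-in for amortized inference: one forward pass, one guess."""
    return ("inc", "inc")  # fixed guess; may fail verification

def search(examples, depth=2):
    """Enumerate latent codes; return the first verifier-passing one."""
    for code in product(CODEBOOK, repeat=depth):
        if verify(decode(code), examples):
            return code
    return None

# Task with target y = 2 * (x + 1): the amortized guess fails the
# verifier, but search recovers a passing code with no extra training.
examples = [(1, 4), (3, 8)]
guess = amortized_guess(examples)   # fails: verify(decode(guess), ...) is False
found = search(examples)            # ("inc", "double") passes the verifier
```

The point of the sketch is the division of labor: the decoder only needs to map codes to candidate programs, while the verifier turns inference into a search problem, which is how a model that guesses wrong in one shot can still reach 100% exact-match at inference time.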