Next-Latent Prediction Transformers Learn Compact World Models (2025)

a day ago
  • #latent-prediction
  • #transformers
  • #world-modeling
  • The paper introduces Next-Latent Prediction (NextLat), an auxiliary training objective for transformers that adds self-supervised prediction in latent space alongside standard next-token training.
  • NextLat encourages transformers to form compact internal world models with belief states and transition dynamics, improving generalization over standard next-token prediction.
  • On the theory side, the paper shows that NextLat's latent representations converge to belief states: compressed summaries of the history that retain what is needed to predict the future.
  • Empirically, NextLat improves results on benchmarks spanning world modeling, reasoning, planning, and language modeling, with gains in accuracy, compression, and planning.
  • NextLat also enables variable-length self-speculative decoding, which speeds up inference by up to 3.3x in language modeling while leaving the transformer's architecture and efficiency unchanged.
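
To make the notion of a belief state concrete, here is a minimal sketch of the classical idea on a toy hidden Markov model. It is not taken from the paper: the two-state model, its probabilities, and the function names (`update_belief`, `predict_observation`, `belief_from_history`) are all illustrative assumptions. The point it shows is that a fixed-size probability vector, updated one observation at a time, carries everything about the history that matters for predicting the next observation.

```python
from typing import List, Sequence

# Hypothetical 2-state hidden Markov model; every number here is illustrative.
# TRANSITION[i][j] = P(next state = j | current state = i)
TRANSITION = [[0.9, 0.1],
              [0.2, 0.8]]
# EMISSION[i][o] = P(observation = o | state = i)
EMISSION = [[0.7, 0.3],
            [0.1, 0.9]]
# Belief over the hidden state before any observation has been seen.
PRIOR = [0.5, 0.5]


def predict_observation(belief: Sequence[float]) -> List[float]:
    """Distribution over the next observation, computed from the belief alone."""
    n_obs = len(EMISSION[0])
    return [sum(b * EMISSION[i][o] for i, b in enumerate(belief))
            for o in range(n_obs)]


def update_belief(belief: Sequence[float], obs: int) -> List[float]:
    """Bayes filter step: condition on `obs`, then advance the state one step."""
    # Weight each state by how well it explains the observation, then normalize.
    joint = [b * EMISSION[i][obs] for i, b in enumerate(belief)]
    total = sum(joint)
    posterior = [x / total for x in joint]
    # Push the posterior through the transition model.
    n = len(TRANSITION)
    return [sum(posterior[i] * TRANSITION[i][j] for i in range(n))
            for j in range(n)]


def belief_from_history(history: Sequence[int]) -> List[float]:
    """Fold an arbitrarily long observation history into one fixed-size belief."""
    belief = list(PRIOR)
    for obs in history:
        belief = update_belief(belief, obs)
    return belief
```

For example, `predict_observation(belief_from_history([0, 0, 1]))` gives the next-observation distribution after three observations. However long the history grows, the belief stays a two-element vector, and the update needs only the previous belief and the newest observation. That recursive structure (current summary plus new input gives the next summary) is the kind of transition dynamics the summary above attributes to NextLat's latents, though how the paper realizes it inside a transformer is not shown here.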