Next-Latent Prediction Transformers Learn Compact World Models (2025)


The paper introduces Next-Latent Prediction (NextLat), an auxiliary training objective that adds self-supervised prediction in latent space to a transformer's standard next-token training.
NextLat encourages transformers to form compact internal world models with belief states and transition dynamics, improving generalization over standard next-token prediction.
Theoretically, NextLat's latent representations converge to belief states: compressed summaries of the history that retain the information needed to predict the future.
Empirically, NextLat improves on standard next-token training across benchmarks covering world modeling, reasoning, planning, and language modeling, with gains in accuracy, representation compression, and planning.
NextLat also enables variable-length self-speculative decoding, which accelerates inference by up to 3.3x in language modeling, without changing the transformer architecture or its efficiency.
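
To make the idea of an auxiliary latent-space objective concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names (`nextlat_style_loss`, `predict_next_latent`), the squared-error latent loss, and the `aux_weight` parameter are all assumptions chosen for illustration. The sketch combines the usual next-token cross-entropy with a penalty on how badly a small dynamics function predicts the next latent state from the current latent state and the next token.

```python
import math

def cross_entropy(logits, target):
    """Negative log-probability of `target` under softmax(logits)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[target]

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def nextlat_style_loss(logits, tokens, latents, predict_next_latent, aux_weight=1.0):
    """Hypothetical combined loss (illustrative, not the paper's exact form).

    latents[t] : latent state after reading tokens[0..t]
    logits[t]  : scores over the vocabulary for tokens[t + 1]
    predict_next_latent(h, tok) : assumed dynamics function that guesses
        latents[t + 1] from latents[t] and the next token.
    """
    steps = len(tokens) - 1
    token_loss = 0.0
    latent_loss = 0.0
    for t in range(steps):
        # Standard next-token prediction term.
        token_loss += cross_entropy(logits[t], tokens[t + 1])
        # Auxiliary term: predicted next latent vs. the actual next latent.
        # In an autodiff framework the target would likely be detached
        # (stop-gradient); that detail is an assumption here.
        predicted = predict_next_latent(latents[t], tokens[t + 1])
        latent_loss += mse(predicted, latents[t + 1])
    token_loss /= steps
    latent_loss /= steps
    return token_loss + aux_weight * latent_loss, token_loss, latent_loss

# Toy example: the first latent coordinate counts the tokens seen so far.
tokens = [0, 1, 1]
latents = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
logits = [[0.0, 0.0], [0.0, 0.0]]  # uniform guess over a 2-token vocabulary

def exact_dynamics(h, tok):
    return [h[0] + tok, h[1]]

def identity_dynamics(h, tok):
    return list(h)

total_exact, tok_loss, lat_exact = nextlat_style_loss(
    logits, tokens, latents, exact_dynamics)
total_ident, _, lat_ident = nextlat_style_loss(
    logits, tokens, latents, identity_dynamics)
```

In the toy example, the dynamics function that tracks the latent transitions exactly incurs no auxiliary penalty, while the identity function is penalized. This is the sense in which such an objective could push latents toward states with predictable transition dynamics.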
