Experts Have World Models. LLMs Have Word Models
- #AI
- #Adversarial Reasoning
- #World Models
- Experts possess world models, while LLMs (Large Language Models) have word models: they are trained to predict the next token rather than the next state of the world.
- Three types of world models are discussed: 3D video world models, Meta's JEPA and related models, and multi-agent world models for adversarial reasoning.
- The essay contrasts evaluating text in isolation with simulating how it will be received in a real-world context where other agents react.
- Examples illustrate how domain experts anticipate adversarial reactions and hidden incentives, which LLMs currently fail to model effectively.
- Perfect-information games like chess differ from imperfect-information games like poker, where hidden state and adversarial adaptation are crucial (see the first sketch after this list).
- LLMs are optimized to produce coherent outputs, but they lack the ability to simulate multi-agent environments in which other parties adapt and counter.
- The core issue is not raw intelligence but the training loop: LLMs need to be graded on outcomes in adversarial settings rather than on static outputs (see the second sketch after this list).
- Experts judge artifacts by their robustness under pressure, while outsiders focus on surface-level qualities like coherence and professionalism.
- The poker vs. chess analogy underscores the challenge of hidden state and adversarial dynamics in real-world applications.
- Future solutions may require multi-agent training environments where LLMs learn from outcomes and adapt to being modeled by others.
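To make the hidden-state point concrete, here is a minimal Python sketch, not from the essay; all class and field names are illustrative. It contrasts a chess-like state, where both players observe everything, with a poker-like state (loosely modeled on Kuhn poker), where each player sees only their own card plus the public action history.

```python
# A minimal sketch, not from the essay: all class and field names are illustrative.
import random
from dataclasses import dataclass, field

@dataclass
class PerfectInfoState:
    """Chess-like: the entire state is public to both players."""
    board: list

    def observation(self, player: int) -> list:
        return self.board  # every piece is visible to everyone

@dataclass
class ImperfectInfoState:
    """Poker-like: each player sees only their own card plus public actions."""
    hands: dict = field(default_factory=dict)
    public_actions: list = field(default_factory=list)

    def deal(self) -> None:
        deck = ["J", "Q", "K"]
        random.shuffle(deck)
        self.hands = {0: deck[0], 1: deck[1]}

    def observation(self, player: int) -> dict:
        # The opponent's card is hidden, so any policy has to reason over a
        # distribution of possible hidden states, not one known state.
        return {"my_card": self.hands[player],
                "history": list(self.public_actions)}

chess = PerfectInfoState(board=["r", "n", "b", "q", "k", "..."])
print(chess.observation(0))   # both players see the same full board

poker = ImperfectInfoState()
poker.deal()
print(poker.observation(0))   # player 0 never sees player 1's card
```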
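And a second minimal sketch of what "grading on outcomes" could look like. This is an assumption about how such a loop might be wired, not the essay's implementation; `draft_policy`, `adversary`, `static_grade`, and `outcome_score` are hypothetical stand-ins.

```python
# A minimal sketch of outcome-based grading in an adversarial setting.
# Hypothetical stand-ins throughout; not the essay's implementation.
import random

def draft_policy(prompt: str) -> str:
    # Stand-in for an LLM drafting a negotiation offer, contract clause, etc.
    return f"offer based on: {prompt}"

def adversary(artifact: str) -> str:
    # Stand-in for the other party probing the artifact and reacting to it.
    return random.choice(["accept", "exploit loophole", "counter aggressively"])

def static_grade(artifact: str) -> float:
    # Grades the text in isolation: coherence, fluency, professionalism.
    return 1.0 if artifact else 0.0

def outcome_score(artifact: str, response: str) -> float:
    # Grades what actually happened once the other side adapted.
    return {"accept": 1.0,
            "counter aggressively": 0.3,
            "exploit loophole": 0.0}[response]

def training_step(prompt: str) -> float:
    artifact = draft_policy(prompt)
    response = adversary(artifact)            # the environment pushes back
    reward = outcome_score(artifact, response)
    # In a real setup this reward, not static_grade(artifact), would drive
    # the policy update (e.g. a policy-gradient step).
    return reward

print(training_step("sell 1,000 units by Q3"))
```

The only point of the toy loop is that the learning signal flows from `outcome_score`, which depends on the adversary's adaptation, rather than from `static_grade`, which rewards surface polish.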