Beyond Semantics: The Unreasonable Effectiveness of Reasonless Intermediate Tokens
- #Chain of Thought
- #Machine Learning
- #Transformer Models
- Recent impressive results from large reasoning models have been interpreted as a triumph of Chain of Thought (CoT).
- The paper critically examines this interpretation by investigating how the semantics of intermediate tokens influence model performance.
- Transformer models are trained on formally verifiable reasoning traces and solutions, where the traces are generated by a formal solver (A* search) and can therefore be checked step by step; a minimal sketch of such a trace-emitting solver appears after this list.
- Despite significant accuracy gains, models trained on correct traces still produce invalid reasoning traces even when they arrive at correct solutions.
- Models trained on noisy, corrupted traces perform largely on par with models trained on correct traces, and sometimes improve upon them (a toy corruption sketch also follows the list).
- The results challenge the assumption that intermediate tokens or 'Chains of Thought' induce predictable reasoning behaviors.
- The paper cautions against anthropomorphizing intermediate outputs or over-interpreting them as evidence of human-like or algorithmic behaviors in language models.
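To make the setup concrete: because the training traces come from a formal solver, every intermediate token sequence can be verified against a reference A* run. Below is a minimal, hypothetical sketch of an A* solver that logs its node expansions as trace tokens. The grid, the `close x y g h` token format, and the function names are illustrative assumptions, not the paper's exact trace grammar.

```python
import heapq

def astar_trace(grid, start, goal):
    """A* over a 4-connected grid, logging each expansion as a trace token.

    The 'close r c g=.. h=..' format is a hypothetical stand-in for the
    solver-derived traces the paper trains on, not its exact grammar.
    """
    def h(p):  # Manhattan distance: admissible heuristic on a 4-connected grid
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start, [start])]  # (f, g, node, path-so-far)
    best_g = {start: 0}
    trace = []
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if g > best_g.get(node, float("inf")):
            continue  # stale heap entry; a cheaper route was already found
        trace.append(f"close {node[0]} {node[1]} g={g} h={h(node)}")
        if node == goal:
            return trace, path  # trace is checkable against a reference run
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (node[0] + dr, node[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and grid[nxt[0]][nxt[1]] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt, path + [nxt]))
    return trace, None  # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]  # 0 = free cell, 1 = wall
trace, path = astar_trace(grid, (0, 0), (2, 0))
print(len(trace), "expansions; solution:", path)
```

Each `(problem, trace, solution)` triple from a run like this becomes one training example, which is what makes the validity of a model's generated trace objectively decidable.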
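And a toy sketch of the corruption experiment: intermediate trace tokens are noised while the final solution is left correct. The `corrupt_trace` helper and random-replacement scheme here are illustrative assumptions; the paper's actual corruptions differ in detail.

```python
import random

def corrupt_trace(trace, vocab, frac=0.5, seed=0):
    """Replace a fraction of trace tokens with random vocabulary tokens.

    One simple corruption scheme for illustration only; the point is that
    the solution paired with the trace stays correct while the trace
    itself is no longer a valid record of the search.
    """
    rng = random.Random(seed)
    out = list(trace)
    for i in rng.sample(range(len(out)), k=int(frac * len(out))):
        out[i] = rng.choice(vocab)
    return out

clean = ["close 0 0 g=0 h=2", "close 0 1 g=1 h=3", "close 2 0 g=6 h=0"]
vocab = ["close 9 9 g=0 h=0", "close 3 1 g=7 h=5"]  # arbitrary noise tokens
print(corrupt_trace(clean, vocab))
```

The paper's finding is that training on such semantically broken traces barely hurts, and can even help, which is what undercuts reading the intermediate tokens as meaningful reasoning.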