Even GPT-5.2 Can't Count to Five: Zero-Error Horizons in Trustworthy LLMs
- #AI Safety
- #LLM Evaluation
- #Algorithmic Capabilities
- Introduces Zero-Error Horizon (ZEH) as a metric for trustworthy LLMs, defined as the maximum input length (problem size) up to which an LLM solves a task with zero errors.
- Evaluates the ZEH of state-of-the-art LLMs such as GPT-5.2, revealing surprising failures on simple tasks (e.g., computing the parity of '11000' or checking balanced parentheses).
- Highlights that ZEH offers insight into the emergence of algorithmic capabilities and, despite some correlation, measures something distinct from accuracy.
- Applies ZEH to Qwen2.5 for a detailed analysis, showing it offers clues about model capabilities that accuracy metrics miss.
- Addresses the computational cost of ZEH evaluation, proposing mitigation via tree structures and online softmax for up to a 10x speedup.