The Fundamental Limits of LLMs at Scale
- #LLM
- #Machine Learning
- #Theoretical Limits
- Large Language Models (LLMs) benefit from scaling but face five fundamental limitations: hallucination, context compression, reasoning degradation, retrieval fragility, and multimodal misalignment.
- A theoretical synthesis connects these limitations to foundational limits of computation, information, and learning, yielding a unified framework.
- Computability theory implies an irreducible error floor: diagonalization guarantees that any fixed model has inputs on which it must fail, and undecidable queries make the set of such failures infinite (see the diagonalization sketch after this list).
- Information-theoretic constraints bound achievable accuracy: a model's finite description length forces compression error, and long-tail knowledge carries high sample complexity (see the Zipf coverage sketch below).
- Geometric and computational effects compress the usable context below its nominal size, via positional under-training, positional-encoding attenuation, and softmax crowding (see the crowding sketch below).
- Likelihood-based training favors pattern completion over genuine inference, and retrieval under token limits suffers from semantic drift and coupling noise (see the retrieval-budget sketch below).
- Multimodal scaling inherits shallow cross-modal alignment; theorems and empirical evidence chart where scaling helps, where it saturates, and where it cannot (see the contrastive-loss sketch below).
- Practical mitigation paths include bounded-oracle retrieval, positional curricula, and sparse or hierarchical attention (a sparse-mask sketch closes the examples below).
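
A minimal sketch of the diagonalization argument, assuming a toy setting where a model is any fixed deterministic function from question strings to yes/no answers; the names `toy_model` and `diagonal_failure` are hypothetical, not from the paper. The adversarial question's ground truth is defined as the negation of the model's own answer, so no fixed model can answer it correctly.

```python
# Diagonalization sketch: for ANY fixed deterministic yes/no model,
# one can construct a question it is guaranteed to answer wrongly.

def toy_model(question: str) -> bool:
    """Stand-in for an arbitrary fixed model (any deterministic rule)."""
    return len(question) % 2 == 0

def diagonal_failure(model):
    """Cantor/Turing-style diagonal query: its ground truth is defined
    as the opposite of whatever `model` answers, so the model errs."""
    question = "Will this model answer 'no' to this exact question?"
    model_answer = model(question)
    ground_truth = not model_answer  # truth of "will it answer 'no'?"
    return question, model_answer, ground_truth

q, ans, truth = diagonal_failure(toy_model)
print(f"model answers {ans}, truth is {truth}; wrong: {ans != truth}")
```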
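
To make the long-tail sample-complexity claim concrete, here is a hedged simulation assuming facts follow a Zipf distribution (an illustrative assumption, not the paper's exact model). It estimates, via a Poisson approximation, what fraction of a fact vocabulary a corpus of a given size has seen at least five times.

```python
import math

def zipf_probs(vocab_size: int, s: float = 1.0) -> list:
    """Zipf law over `vocab_size` facts: p(rank r) proportional to r**-s."""
    weights = [r ** -s for r in range(1, vocab_size + 1)]
    z = sum(weights)
    return [w / z for w in weights]

def expected_coverage(probs, corpus_tokens: int, min_count: int) -> float:
    """Expected fraction of facts occurring >= min_count times among
    `corpus_tokens` i.i.d. draws, using a Poisson approximation."""
    covered = 0.0
    for p in probs:
        lam = p * corpus_tokens
        head = sum(math.exp(-lam) * lam ** j / math.factorial(j)
                   for j in range(min_count))
        covered += 1.0 - head  # P(count >= min_count)
    return covered / len(probs)

probs = zipf_probs(100_000)
for n in (10**5, 10**6, 10**7, 10**8):
    pct = expected_coverage(probs, n, min_count=5)
    print(f"{n:>11,} tokens: {pct:6.1%} of facts seen >= 5 times")
```

Even the largest corpus here leaves most tail facts effectively unobserved, which is the sample-complexity wall the bullet refers to.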
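
A numerical sketch of softmax crowding, under the assumption that attention logits are bounded by a constant B (B = 5 below is an arbitrary illustrative value). With all logits in [-B, B], the best case for retrieving one token is a single logit at +B against n-1 logits at -B, and even that maximum weight decays as context length grows.

```python
import math

def max_attention_weight(n_keys: int, logit_bound: float) -> float:
    """Upper bound on one key's softmax weight when logits lie in
    [-B, B]: one logit at +B, the remaining n-1 at -B."""
    return 1.0 / (1.0 + (n_keys - 1) * math.exp(-2.0 * logit_bound))

B = 5.0  # assumed logit bound, for illustration only
for n in (1_000, 10_000, 100_000, 1_000_000):
    w = max_attention_weight(n, B)
    print(f"context {n:>9,}: max weight on the target token = {w:.4f}")
```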
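
A hedged sketch of retrieval fragility under a token budget, using a toy bag-of-words cosine similarity as a stand-in for a real embedding model (the chunk text and helper names are invented for illustration): chunks that merely echo the query's wording outrank and crowd out the one chunk that actually carries the answer.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list, token_budget: int) -> list:
    """Greedy packing of the highest-similarity chunks under a budget."""
    qv = Counter(query.lower().split())
    ranked = sorted(chunks, reverse=True,
                    key=lambda c: cosine(qv, Counter(c.lower().split())))
    picked, used = [], 0
    for c in ranked:
        n = len(c.split())
        if used + n <= token_budget:
            picked.append(c)
            used += n
    return picked

query = "what year was the transformer architecture introduced"
chunks = [
    "the transformer architecture what year introduced question asked often",
    "transformer architecture year introduced is a popular query on forums",
    "Attention Is All You Need (2017) proposed the transformer model",
]
ctx = retrieve(query, chunks, token_budget=20)
print("packed context:", ctx)
print("answer present:", any("2017" in c for c in ctx))
```

The two echo chunks fill the budget first, so the answer-bearing chunk never makes it into context: a toy instance of semantic drift under token limits.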
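
To show what "shallow cross-modal alignment" refers to, here is a sketch of a CLIP-style symmetric InfoNCE loss (a common contrastive objective; the paper may analyze other variants): each image and caption is coupled only through one pooled embedding per modality, so alignment is global rather than fine-grained.

```python
import numpy as np

def info_nce(img: np.ndarray, txt: np.ndarray, temp: float = 0.07) -> float:
    """Symmetric contrastive loss over a batch of paired embeddings.
    Each modality is reduced to a single pooled vector, so the two
    modalities are aligned only at this coarse, global level."""
    img = img / np.linalg.norm(img, axis=1, keepdims=True)
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    logits = img @ txt.T / temp  # (B, B); matched pairs on the diagonal
    diag = np.arange(len(img))
    log_sm_rows = logits - np.log(np.exp(logits).sum(1, keepdims=True))
    log_sm_cols = logits - np.log(np.exp(logits).sum(0, keepdims=True))
    loss = -(log_sm_rows[diag, diag].mean()
             + log_sm_cols[diag, diag].mean()) / 2
    return float(loss)

rng = np.random.default_rng(0)
img, txt = rng.normal(size=(8, 64)), rng.normal(size=(8, 64))
print(f"InfoNCE on random pairs: {info_nce(img, txt):.3f}")  # near ln(8)
```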
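
Finally, a sketch of one named mitigation, sparse attention, assuming a simple local-window-plus-stride pattern (one common variant; the window and stride values are arbitrary): each query attends to its recent neighborhood plus a strided subset of earlier tokens, so the number of attended entries grows roughly linearly in sequence length rather than quadratically.

```python
import numpy as np

def sparse_attention_mask(n: int, window: int = 4, stride: int = 8):
    """Boolean (n, n) causal mask: position i attends to the previous
    `window` tokens and to every `stride`-th earlier token."""
    q = np.arange(n)[:, None]   # query positions
    k = np.arange(n)[None, :]   # key positions
    causal = k <= q
    local = (q - k) < window
    strided = (k % stride) == 0
    return causal & (local | strided)

mask = sparse_attention_mask(16)
print(f"dense entries: {16 * 16}, kept by sparse mask: {int(mask.sum())}")
```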