Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space
- #Language Models
- #Machine Learning
- #Scaling Laws
- Proposes Dynamic Large Concept Models (DLCM), a hierarchical language modeling framework that shifts computation from tokens to a compressed concept space (see the first sketch after this list).
- DLCM discovers variable-length concepts end-to-end without predefined linguistic units, improving reasoning efficiency.
- Introduces the first compression-aware scaling law, enabling principled compute allocation under a fixed FLOPs budget (the arithmetic is worked through in the second sketch below).
- Develops a decoupled μP parametrization for stable training across widths and compression regimes (a standard μP-style sketch follows below).
- Achieves a +2.69% average improvement across 12 zero-shot benchmarks under matched inference FLOPs.
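
The summary doesn't specify how DLCM segments tokens into concepts, so the sketch below is only a minimal illustration of the idea: a learned boundary head splits a token sequence into variable-length spans and mean-pools each span into one concept vector, shrinking the sequence that deeper layers process. `ConceptPooler`, the linear boundary head, and the hard threshold are all illustrative assumptions, not the paper's mechanism.

```python
import torch
import torch.nn as nn

class ConceptPooler(nn.Module):
    """Illustrative module: segments a token sequence into variable-length
    spans and mean-pools each span into one 'concept' vector. The boundary
    predictor here is a simple linear head with a hard threshold; DLCM's
    segmentation is learned end-to-end and may work quite differently."""

    def __init__(self, d_model: int):
        super().__init__()
        self.boundary_head = nn.Linear(d_model, 1)  # scores a span boundary per token

    def forward(self, token_states: torch.Tensor) -> list[torch.Tensor]:
        # token_states: (seq_len, d_model) for a single sequence
        boundary_logits = self.boundary_head(token_states).squeeze(-1)  # (seq_len,)
        is_boundary = boundary_logits > 0.0  # hard split, for illustration only
        concepts, start = [], 0
        for t in range(token_states.size(0)):
            if is_boundary[t] or t == token_states.size(0) - 1:
                span = token_states[start : t + 1]
                concepts.append(span.mean(dim=0))  # one vector per concept span
                start = t + 1
        return concepts  # typically far fewer vectors than input tokens

pooler = ConceptPooler(d_model=64)
tokens = torch.randn(128, 64)                      # 128 token hidden states
concepts = pooler(tokens)
print(len(concepts), "concepts from 128 tokens")   # deeper layers attend over these
```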
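The compression-aware scaling law's functional form isn't given here, but the compute trade-off it governs can be worked through directly: if concept-space layers see a sequence r times shorter, their matmul cost falls by roughly r and their attention cost by roughly r². The sketch below uses the standard per-layer transformer FLOPs estimate; the compression ratio and the layer split are assumed numbers for illustration.

```python
def layer_flops(d_model: int, seq_len: int) -> float:
    """Rough forward-pass FLOPs for one transformer layer: ~12 * d^2
    matmul parameters (attention projections + MLP) at 2 FLOPs per
    parameter per position, plus the quadratic attention score/mix terms."""
    matmul = 2 * 12 * d_model**2 * seq_len
    attention = 2 * 2 * seq_len**2 * d_model
    return matmul + attention

d, seq_len, ratio = 1024, 4096, 4          # compression ratio r = 4 (assumed)
token_layers, concept_layers = 4, 20       # hypothetical hierarchical split

token_cost = token_layers * layer_flops(d, seq_len)
concept_cost = concept_layers * layer_flops(d, seq_len // ratio)
baseline = (token_layers + concept_layers) * layer_flops(d, seq_len)

print(f"hierarchical: {token_cost + concept_cost:.3e} FLOPs")
print(f"flat baseline: {baseline:.3e} FLOPs")
# Under a fixed FLOPs budget, the savings can buy extra depth or width in
# concept space -- the trade-off a compression-aware scaling law would model.
```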
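μP (maximal update parametrization) stabilizes training across model widths by scaling initialization and per-layer learning rates with width; how DLCM decouples this across compression regimes is not described in the summary. As a minimal sketch of the standard μP-style rule under Adam, the snippet below shrinks matrix learning rates by `base_width / width`; `mup_param_groups` and the chosen widths are hypothetical, and full μP also treats embeddings, output layers, and initialization specially.

```python
import torch
import torch.nn as nn

def mup_param_groups(model: nn.Module, base_lr: float, width: int, base_width: int):
    """Standard muP-style Adam groups: hidden matrix learning rates shrink
    as 1/width relative to a tuned base width, so hyperparameters transfer
    when the model is scaled up. (DLCM's decoupled variant additionally
    separates this from the compression regime -- not modeled here.)"""
    mult = base_width / width
    hidden, other = [], []
    for _, p in model.named_parameters():
        (hidden if p.ndim >= 2 else other).append(p)  # matrices vs biases/norms
    return [
        {"params": hidden, "lr": base_lr * mult},  # width-scaled LR for matrices
        {"params": other, "lr": base_lr},          # vector-like params keep base LR
    ]

width, base_width = 2048, 256
model = nn.Sequential(nn.Linear(width, 4 * width), nn.GELU(), nn.Linear(4 * width, width))
opt = torch.optim.Adam(mup_param_groups(model, base_lr=1e-3, width=width, base_width=base_width))
print([g["lr"] for g in opt.param_groups])  # matrix LR scaled down by base_width / width
```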