Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space
- #Language Models
- #Machine Learning
- #Scaling Laws
- Proposes Dynamic Large Concept Models (DLCM), a hierarchical language modeling framework that shifts computation from tokens to a compressed concept space (see the first sketch after this list).
- DLCM discovers variable-length concepts end-to-end without predefined linguistic units, improving reasoning efficiency.
- Introduces the first compression-aware scaling law, enabling principled compute allocation under a fixed FLOPs budget (the arithmetic is worked through in the second sketch below).
- Develops a decoupled μP parametrization for stable training across widths and compression regimes (a standard μP-style sketch follows below).
- Achieves a +2.69% average improvement across 12 zero-shot benchmarks under matched inference FLOPs.
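
The summary doesn't specify how DLCM segments tokens into concepts, so the sketch below is only a minimal illustration of the idea: a learned boundary head splits a token sequence into variable-length spans and mean-pools each span into one concept vector, shrinking the sequence that deeper layers process. `ConceptPooler`, the linear boundary head, and the hard threshold are all illustrative assumptions, not the paper's mechanism.

```python
import torch
import torch.nn as nn

class ConceptPooler(nn.Module):
    """Illustrative module: segments a token sequence into variable-length
    spans and mean-pools each span into one 'concept' vector. The boundary
    predictor here is a simple linear head with a hard threshold; DLCM's
    segmentation is learned end-to-end and may work quite differently."""

    def __init__(self, d_model: int):
        super().__init__()
        self.boundary_head = nn.Linear(d_model, 1)  # scores a span boundary per token

    def forward(self, token_states: torch.Tensor) -> list[torch.Tensor]:
        # token_states: (seq_len, d_model) for a single sequence
        boundary_logits = self.boundary_head(token_states).squeeze(-1)  # (seq_len,)
        is_boundary = boundary_logits > 0.0  # hard split, for illustration only
        concepts, start = [], 0
        for t in range(token_states.size(0)):
            if is_boundary[t] or t == token_states.size(0) - 1:
                span = token_states[start : t + 1]
                concepts.append(span.mean(dim=0))  # one vector per concept span
                start = t + 1
        return concepts  # typically far fewer vectors than input tokens

pooler = ConceptPooler(d_model=64)
tokens = torch.randn(128, 64)                      # 128 token hidden states
concepts = pooler(tokens)
print(len(concepts), "concepts from 128 tokens")   # deeper layers attend over these
```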
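The compression-aware scaling law's functional form isn't given here, but the compute trade-off it governs can be worked through directly: if concept-space layers see a sequence r times shorter, their matmul cost falls by roughly r and their attention cost by roughly r². The sketch below uses the standard per-layer transformer FLOPs estimate; the compression ratio and the layer split are assumed numbers for illustration.

```python
def layer_flops(d_model: int, seq_len: int) -> float:
    """Rough forward-pass FLOPs for one transformer layer: ~12 * d^2
    matmul parameters (attention projections + MLP) at 2 FLOPs per
    parameter per position, plus the quadratic attention score/mix terms."""
    matmul = 2 * 12 * d_model**2 * seq_len
    attention = 2 * 2 * seq_len**2 * d_model
    return matmul + attention

d, seq_len, ratio = 1024, 4096, 4          # compression ratio r = 4 (assumed)
token_layers, concept_layers = 4, 20       # hypothetical hierarchical split

token_cost = token_layers * layer_flops(d, seq_len)
concept_cost = concept_layers * layer_flops(d, seq_len // ratio)
baseline = (token_layers + concept_layers) * layer_flops(d, seq_len)

print(f"hierarchical: {token_cost + concept_cost:.3e} FLOPs")
print(f"flat baseline: {baseline:.3e} FLOPs")
# Under a fixed FLOPs budget, the savings can buy extra depth or width in
# concept space -- the trade-off a compression-aware scaling law would model.
```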
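μP (maximal update parametrization) stabilizes training across model widths by scaling initialization and per-layer learning rates with width; how DLCM decouples this across compression regimes is not described in the summary. As a minimal sketch of the standard μP-style rule under Adam, the snippet below shrinks matrix learning rates by `base_width / width`; `mup_param_groups` and the chosen widths are hypothetical, and full μP also treats embeddings, output layers, and initialization specially.

```python
import torch
import torch.nn as nn

def mup_param_groups(model: nn.Module, base_lr: float, width: int, base_width: int):
    """Standard muP-style Adam groups: hidden matrix learning rates shrink
    as 1/width relative to a tuned base width, so hyperparameters transfer
    when the model is scaled up. (DLCM's decoupled variant additionally
    separates this from the compression regime -- not modeled here.)"""
    mult = base_width / width
    hidden, other = [], []
    for _, p in model.named_parameters():
        (hidden if p.ndim >= 2 else other).append(p)  # matrices vs biases/norms
    return [
        {"params": hidden, "lr": base_lr * mult},  # width-scaled LR for matrices
        {"params": other, "lr": base_lr},          # vector-like params keep base LR
    ]

width, base_width = 2048, 256
model = nn.Sequential(nn.Linear(width, 4 * width), nn.GELU(), nn.Linear(4 * width, width))
opt = torch.optim.Adam(mup_param_groups(model, base_lr=1e-3, width=width, base_width=base_width))
print([g["lr"] for g in opt.param_groups])  # matrix LR scaled down by base_width / width
```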