Ling-1T: 1T-parameter model with 50B active parameters per token
- #AI
- #Machine Learning
- #Natural Language Processing
- Ling-1T is the first flagship non-thinking model in the Ling 2.0 series with 1 trillion total parameters and 50 billion active parameters per token.
- Pre-trained on more than 20 trillion high-quality, reasoning-dense tokens, supports context lengths up to 128K, and uses an evolutionary chain-of-thought (Evo-CoT) process.
- Achieves state-of-the-art performance on complex reasoning benchmarks, balancing accuracy and efficiency.
- Excels in visual reasoning and front-end code generation with a hybrid Syntax–Function–Aesthetics reward mechanism.
- Demonstrates emergent reasoning and transfer capabilities at the trillion-parameter level.
- Built on the Ling 2.0 Mixture-of-Experts (MoE) architecture, designed for trillion-scale efficiency with 1T total / 50B active parameters and FP8 training (a toy routing sketch follows this list).
- Post-training uses Evo-CoT for progressive reasoning enhancement and introduces LPO, a sentence-level policy-optimization method (see the toy sketch after this list).
- Extensively evaluated across knowledge, code, math, reasoning, agent, and alignment benchmarks.
- Available for download on Hugging Face and ModelScope, with quickstart guides for API usage and deployment (a minimal loading sketch follows this list).
- Future plans include improving attention efficiency, agentic ability, and alignment.
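To give intuition for the 1T-total / 50B-active split, here is a toy Mixture-of-Experts routing sketch in PyTorch. The expert count, hidden size, and top-k value are made up for illustration and do not reflect Ling 2.0's actual routing; the point is only that every expert counts toward total parameters while each token activates just a few.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoELayer(nn.Module):
    """Toy MoE layer: many experts exist (total parameters), but each token
    is routed to only top_k of them (active parameters)."""

    def __init__(self, d_model=64, n_experts=32, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_experts))
        self.top_k = top_k

    def forward(self, x):                      # x: [tokens, d_model]
        scores = self.router(x)                # [tokens, n_experts]
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):         # only the selected experts run
            for e in idx[:, slot].unique():
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot, None] * self.experts[int(e)](x[mask])
        return out

layer = TinyMoELayer()
y = layer(torch.randn(8, 64))                  # 8 tokens through the toy layer
total = sum(p.numel() for p in layer.experts.parameters())
active = total * layer.top_k // len(layer.experts)
print(f"expert params total={total}, active per token={active}")  # analogous in spirit to ~50B active of 1T total
```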
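The summary above does not spell out LPO's formulation, so the following is only a toy illustration of sentence-level credit assignment: each token is weighted by the advantage of the sentence it belongs to, rather than by a per-token or whole-sequence signal. The sentence splitting, advantage values, and loss form here are hypothetical and are not Ling-1T's actual LPO.

```python
import re
import torch

def sentence_level_policy_loss(token_logprobs, token_text, sentence_advantages):
    """Toy sentence-granular policy-gradient loss: every token inherits the
    advantage of the sentence it belongs to (hypothetical illustration only)."""
    # Map each token to a sentence index by counting sentence-ending punctuation.
    sent_idx, idx = [], 0
    for tok in token_text:
        sent_idx.append(idx)
        if re.search(r"[.!?]$", tok):
            idx += 1
    adv = torch.tensor([sentence_advantages[i] for i in sent_idx])
    # REINFORCE-style objective: maximize advantage-weighted log-likelihood.
    return -(adv * token_logprobs).mean()

# Example: two sentences, the second judged better by a reward signal.
tokens = ["The", "answer", "is", "4.", "Because", "2+2", "equals", "4."]
logps = torch.randn(len(tokens), requires_grad=True)
loss = sentence_level_policy_loss(logps, tokens, sentence_advantages=[-0.2, 0.8])
loss.backward()
print(float(loss))
```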
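For the Hugging Face route, a minimal loading sketch might look like the following. The repo id `inclusionAI/Ling-1T`, dtype handling, and generation settings are assumptions to verify against the official model card and quickstart, and a model of this size needs multi-GPU or multi-node serving in practice.

```python
# Minimal Hugging Face quickstart sketch (not the official guide): repo id and
# generation settings below are assumptions -- check the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inclusionAI/Ling-1T"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",        # follow the checkpoint's stored dtype
    device_map="auto",         # shard across available GPUs
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Explain mixture-of-experts routing in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```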