Ling-1T: 1T-parameter model with 50B active parameters per token
- #AI
- #Machine Learning
- #Natural Language Processing
- Ling-1T is the first flagship non-thinking model in the Ling 2.0 series with 1 trillion total parameters and 50 billion active parameters per token.
- Pre-trained on more than 20 trillion high-quality, reasoning-dense tokens, supports context lengths up to 128K, and uses an evolutionary chain-of-thought (Evo-CoT) process.
- Achieves state-of-the-art performance on complex reasoning benchmarks, balancing accuracy and efficiency.
- Excels in visual reasoning and front-end code generation with a hybrid Syntax–Function–Aesthetics reward mechanism.
- Demonstrates emergent reasoning and transfer capabilities at the trillion-parameter level.
- Built on the Ling 2.0 Mixture-of-Experts (MoE) architecture, designed for trillion-scale efficiency with 1T total / 50B active parameters and FP8 training (a toy routing sketch follows this list).
- Post-training uses Evo-CoT for progressive reasoning enhancement and introduces LPO, a sentence-level policy-optimization method (see the toy sketch after this list).
- Extensively evaluated across knowledge, code, math, reasoning, agent, and alignment benchmarks.
- Available for download on Hugging Face and ModelScope, with quickstart guides for API usage and deployment (a minimal loading sketch follows this list).
- Future plans include improving attention efficiency, agentic ability, and alignment.
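To give intuition for the 1T-total / 50B-active split, here is a toy Mixture-of-Experts routing sketch in PyTorch. The expert count, hidden size, and top-k value are made up for illustration and do not reflect Ling 2.0's actual routing; the point is only that every expert counts toward total parameters while each token activates just a few.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoELayer(nn.Module):
    """Toy MoE layer: many experts exist (total parameters), but each token
    is routed to only top_k of them (active parameters)."""

    def __init__(self, d_model=64, n_experts=32, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_experts))
        self.top_k = top_k

    def forward(self, x):                      # x: [tokens, d_model]
        scores = self.router(x)                # [tokens, n_experts]
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):         # only the selected experts run
            for e in idx[:, slot].unique():
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot, None] * self.experts[int(e)](x[mask])
        return out

layer = TinyMoELayer()
y = layer(torch.randn(8, 64))                  # 8 tokens through the toy layer
total = sum(p.numel() for p in layer.experts.parameters())
active = total * layer.top_k // len(layer.experts)
print(f"expert params total={total}, active per token={active}")  # analogous in spirit to ~50B active of 1T total
```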
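The summary above does not spell out LPO's formulation, so the following is only a toy illustration of sentence-level credit assignment: each token is weighted by the advantage of the sentence it belongs to, rather than by a per-token or whole-sequence signal. The sentence splitting, advantage values, and loss form here are hypothetical and are not Ling-1T's actual LPO.

```python
import re
import torch

def sentence_level_policy_loss(token_logprobs, token_text, sentence_advantages):
    """Toy sentence-granular policy-gradient loss: every token inherits the
    advantage of the sentence it belongs to (hypothetical illustration only)."""
    # Map each token to a sentence index by counting sentence-ending punctuation.
    sent_idx, idx = [], 0
    for tok in token_text:
        sent_idx.append(idx)
        if re.search(r"[.!?]$", tok):
            idx += 1
    adv = torch.tensor([sentence_advantages[i] for i in sent_idx])
    # REINFORCE-style objective: maximize advantage-weighted log-likelihood.
    return -(adv * token_logprobs).mean()

# Example: two sentences, the second judged better by a reward signal.
tokens = ["The", "answer", "is", "4.", "Because", "2+2", "equals", "4."]
logps = torch.randn(len(tokens), requires_grad=True)
loss = sentence_level_policy_loss(logps, tokens, sentence_advantages=[-0.2, 0.8])
loss.backward()
print(float(loss))
```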
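For the Hugging Face route, a minimal loading sketch might look like the following. The repo id `inclusionAI/Ling-1T`, dtype handling, and generation settings are assumptions to verify against the official model card and quickstart, and a model of this size needs multi-GPU or multi-node serving in practice.

```python
# Minimal Hugging Face quickstart sketch (not the official guide): repo id and
# generation settings below are assumptions -- check the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inclusionAI/Ling-1T"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",        # follow the checkpoint's stored dtype
    device_map="auto",         # shard across available GPUs
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Explain mixture-of-experts routing in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```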