Hasty Briefs

Ling-1T: 1T-parameter model with 50B active parameters per token

  • #AI
  • #Machine Learning
  • #Natural Language Processing
  • Ling-1T is the first flagship non-thinking model in the Ling 2.0 series with 1 trillion total parameters and 50 billion active parameters per token.
  • Pre-trained on more than 20 trillion high-quality, reasoning-dense tokens, with support for context lengths up to 128K and an evolutionary chain-of-thought (Evo-CoT) training process.
  • Achieves state-of-the-art performance on complex reasoning benchmarks, balancing accuracy and efficiency.
  • Excels at visual reasoning and front-end code generation, guided by a hybrid Syntax–Function–Aesthetics reward mechanism.
  • Demonstrates emergent reasoning and transfer capabilities at the trillion-parameter level.
  • Built on the Ling 2.0 architecture, designed for trillion-scale efficiency, with key innovations such as the sparse 1T-total / 50B-active parameter design and FP8 training.
  • Post-training uses Evo-CoT for progressive reasoning enhancement and introduces LPO, a sentence-level policy optimization method.
  • Extensively evaluated across knowledge, code, math, reasoning, agent, and alignment benchmarks.
  • Available for download on Hugging Face and ModelScope, with quickstart guides for API usage and deployment (a minimal loading sketch follows this list).
  • Future plans include improving attention efficiency, agentic ability, and alignment.
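Since the brief points to a Hugging Face quickstart, here is a minimal loading-and-generation sketch using the standard transformers API. The repo id `inclusionAI/Ling-1T` and the `trust_remote_code` flag are assumptions to verify against the model card, and a 1T-parameter checkpoint realistically needs multi-GPU (often multi-node) serving rather than a single machine:

```python
# Minimal sketch of loading Ling-1T with the standard transformers API.
# The repo id "inclusionAI/Ling-1T" is assumed from the Hugging Face listing;
# check the model card for the exact id and recommended serving setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "inclusionAI/Ling-1T"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",      # keep the checkpoint's native dtype
    device_map="auto",       # shard across all available GPUs
    trust_remote_code=True,  # in case the repo ships custom architecture code
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For production use, the official quickstart also covers API access and dedicated inference engines, which are the practical route at this scale.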