GLM-5.2: Built for Long-Horizon Tasks

6 hours ago
  • #LLM
  • #Long-Context
  • #Coding-Agent
  • GLM-5.2 supports a 1M-token context window, aimed at stable performance on long-horizon tasks such as complex coding and agentic scenarios.
  • Introduces the IndexShare architecture, which reduces per-token FLOPs by 2.9x at 1M context, and improves the MTP (multi-token prediction) layer used for speculative decoding, increasing acceptance length by up to 20%.
  • Offers advanced coding capabilities with adjustable thinking-effort levels that let users trade performance against latency, and includes an anti-hack module to prevent reward hacking in coding agents.
  • Achieves strong benchmark results: highest-ranked open-source model on long-horizon coding benchmarks (e.g., FrontierSWE, PostTrainBench, SWE-Marathon) and top open-source performer on standard coding benchmarks (e.g., Terminal-Bench 2.1, SWE-bench Pro).
  • Uses the slime infrastructure for efficient agentic RL post-training, supporting large-scale, parallel OPD training and flexible inference integration.
  • Optimized for efficient 1M-context serving, with improvements to KV-cache management, kernel coordination, and CPU-side processing that raise throughput and scalability.
  • Released under the MIT open-source license with no regional restrictions; available via the GLM Coding Plan, Z.ai chat, and local deployment through platforms such as Hugging Face and ModelScope.
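
To put the efficiency figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It is not taken from the GLM-5.2 release. It assumes that "acceptance length" means the average number of tokens emitted per target-model verification pass, so that generating N tokens takes roughly N divided by that length. The function names, the per-token acceptance probability `p`, and the draft length `k` are all hypothetical, and the last function uses the textbook independent-acceptance model for speculative decoding rather than anything specific to the MTP layer.

```python
def remaining_fraction(reduction_factor: float) -> float:
    """Fraction of the original cost left after an N-fold reduction."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return 1.0 / reduction_factor


def verification_passes_saved(acceptance_gain: float) -> float:
    """Fraction of target-model passes saved when the tokens emitted per
    pass grow by `acceptance_gain` (0.20 means 20% longer).

    Assumption: passes needed for N tokens is about N / acceptance_length.
    """
    if acceptance_gain <= -1.0:
        raise ValueError("acceptance_gain must be greater than -1")
    return 1.0 - 1.0 / (1.0 + acceptance_gain)


def expected_tokens_per_pass(p: float, k: int) -> float:
    """Expected tokens emitted per verification pass with k draft tokens.

    Assumption (hypothetical model): each draft token is accepted
    independently with probability p, drafting stops at the first
    rejection, and the target model always contributes one token.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    if p == 1.0:
        return float(k + 1)
    # Geometric series: 1 + p + p^2 + ... + p^k
    return (1.0 - p ** (k + 1)) / (1.0 - p)


if __name__ == "__main__":
    # Figures quoted in the summary: 2.9x fewer per-token FLOPs, up to 20% gain.
    print(f"FLOPs remaining per token: {remaining_fraction(2.9):.1%}")
    print(f"Verification passes saved: {verification_passes_saved(0.20):.1%}")
```

Under these assumptions, a 2.9x reduction leaves about 34% of the original per-token FLOPs, and a 20% longer acceptance length removes about 17% of the verification passes. The actual end-to-end speedup depends on draft overhead and serving details that the summary does not give.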