Hasty Briefs


Laguna XS.2 and M.1

3 hours ago
  • #Agentic Coding
  • #AI Models
  • #Open Source
  • Released the first two models in the Laguna family, Laguna M.1 (225B-A23B) and Laguna XS.2 (33B-A3B), together with a runtime for training and operating agents, available via API and OpenRouter.
  • Laguna M.1 is a 225B-total-parameter MoE model (23B active per token) trained on 30T tokens, scoring 46.9% on SWE-bench Pro and 40.7% on Terminal-Bench 2.0; Laguna XS.2 is a smaller open-weight MoE model (33B total, 3B active) released under the Apache 2.0 license, scoring 44.5% on SWE-bench Pro and 30.1% on Terminal-Bench 2.0.
  • Focused on agentic coding models for long-horizon tasks, treating coding as the core skill that lets agents act on the world by composing and executing software.
  • Trained on NVIDIA hardware; Laguna XS.2 is supported in TensorRT-LLM and optimized for the Blackwell architecture. The data pipeline combined curated data, 4.4T+ tokens of synthetic data, and AutoMixer for data-mixture optimization.
  • Trained with the Muon optimizer, which reduced training steps by 15% compared to AdamW; the distributed implementation minimizes communication overhead and verifies stability with hash checks.
  • Developed an asynchronous online RL system for agent training that handles long-horizon tasks, using the CISPO algorithm for off-policy stability and GPUDirect RDMA for efficient weight transfer.
  • Released an agent harness (ACP server) as a research preview, in line with the vision of tightly integrating models and agents for future software development.
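The "A23B"/"A3B" part of the model names above refers to active parameters: a MoE router sends each token to only a few experts, so only a fraction of the total weights runs per forward pass. A minimal sketch of top-k routing (the expert count and logits here are illustrative, not Laguna's actual configuration):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def topk_gate(router_logits, k):
    """Top-k MoE routing: each token is dispatched to its k highest-scoring
    experts, and the gate weights are renormalized over just those experts.
    Only the selected experts' parameters are 'active' for that token."""
    # indices of the k largest logits per token, in descending order
    idx = np.argsort(router_logits, axis=-1)[:, ::-1][:, :k]
    gates = np.take_along_axis(softmax(router_logits), idx, axis=-1)
    gates = gates / gates.sum(axis=-1, keepdims=True)  # renormalize over top-k
    return idx, gates

# two tokens routed over four hypothetical experts, k=2
logits = np.array([[0.1, 2.0, -1.0, 0.5],
                   [1.0, 0.0, 3.0, 2.0]])
idx, gates = topk_gate(logits, k=2)
```

With 2 of 4 experts active per token, roughly half the expert parameters participate per pass, which is how a 225B-total model can run with only 23B active.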
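The Muon optimizer mentioned above replaces the raw momentum update with an approximately orthogonalized one, typically via a quintic Newton-Schulz iteration. A hedged numpy sketch (the coefficients are the commonly cited ones from Muon write-ups; the learning rate and momentum are illustrative, not Laguna's training values):

```python
import numpy as np

def newton_schulz(g, steps=5, eps=1e-7):
    """Approximately orthogonalize a matrix with the quintic Newton-Schulz
    iteration used by Muon-style optimizers: singular values are driven
    toward 1 without an explicit SVD."""
    a, b, c = 3.4445, -4.7750, 2.0315  # commonly cited iteration coefficients
    x = g / (np.linalg.norm(g) + eps)  # scale so singular values are <= 1
    transposed = g.shape[0] > g.shape[1]
    if transposed:
        x = x.T  # work in the wide orientation (smaller Gram matrix)
    for _ in range(steps):
        s = x @ x.T
        x = a * x + (b * s + c * (s @ s)) @ x
    return x.T if transposed else x

def muon_step(w, grad, buf, lr=0.02, beta=0.95):
    """One sketched Muon update for a 2D weight: accumulate momentum,
    then step along the orthogonalized momentum (hypothetical hyperparameters)."""
    buf = beta * buf + grad
    w = w - lr * newton_schulz(buf)
    return w, buf
```

Orthogonalizing the update equalizes the scale of its singular directions, which is the property credited with the reported step-count savings over AdamW.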
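CISPO, named above for off-policy stability, is usually described as clipping the importance-sampling weight itself rather than the PPO-style surrogate objective, so clipped tokens still contribute gradient. A minimal sketch assuming that formulation (the epsilon values are illustrative; in numpy there is no autograd, so the stop-gradient on the clipped weight is noted in a comment):

```python
import numpy as np

def cispo_loss(logp_new, logp_old, advantages, eps_low=1.0, eps_high=0.28):
    """CISPO-style loss sketch: clip the importance-sampling weight and
    treat it as a constant (stop-gradient in an autodiff framework),
    weighting a REINFORCE-style per-token objective."""
    # off-policy correction: ratio of current policy to behavior policy
    ratio = np.exp(logp_new - logp_old)
    # clip the IS weight itself; unlike PPO's clipped objective, every
    # token keeps a gradient signal through logp_new
    clipped_w = np.clip(ratio, 1.0 - eps_low, 1.0 + eps_high)
    return -(clipped_w * advantages * logp_new).mean()
```

Because generation in an asynchronous online RL system lags behind the learner, these trajectories are mildly off-policy; bounding the IS weight keeps the gradient variance controlled over long horizons.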