Hasty Briefs (beta)

MiniMax M2.5 released: 80.2% in SWE-bench Verified

6 hours ago
  • #AI
  • #Machine Learning
  • #Productivity
  • MiniMax introduces M2.5, a faster, stronger, and smarter model optimized for real-world productivity.
  • M2.5 excels in coding, agentic tool use, search, and office work, scoring 80.2% on SWE-bench Verified and 76.3% on BrowseComp.
  • The model is cost-effective: running it continuously costs $1 per hour at 100 tokens per second, or $0.30 per hour at 50 tokens per second.
  • M2.5 shows significant improvements in multi-language coding tasks and architectural planning, having been trained on 10+ programming languages across 200,000+ real-world environments.
  • Enhanced search and tool calling capabilities make M2.5 adept at expert-level tasks, with better efficiency and decision-making.
  • Office productivity is boosted with M2.5's ability to handle Word, PowerPoint, and Excel tasks, achieving a 59.0% win rate in evaluations.
  • M2.5 is 37% faster than its predecessor, M2.1, and matches Claude Opus 4.6's speed at a fraction of the cost.
  • The model supports agentic applications with two versions: M2.5 and M2.5-Lightning, differing in speed and cost.
  • MiniMax Agent integrates M2.5, offering standardized Office Skills and customizable Experts for various industries.
  • M2.5 already handles 30% of MiniMax's internal tasks and generates 80% of the company's new code commits.
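
To put the hourly pricing in more familiar terms, the figures can be converted into an implied cost per million tokens. The sketch below is only a back-of-the-envelope calculation: it assumes the model generates continuously at the stated rate for the full hour and that the hourly figure is the entire cost, ignoring any separate charge for input tokens. The function names are illustrative and do not come from MiniMax.

```python
SECONDS_PER_HOUR = 3600


def tokens_per_hour(tokens_per_second: float) -> float:
    """Tokens produced in one hour of continuous generation."""
    return tokens_per_second * SECONDS_PER_HOUR


def cost_per_million_tokens(dollars_per_hour: float, tokens_per_second: float) -> float:
    """Implied price per one million tokens at a given hourly cost and speed."""
    return dollars_per_hour / tokens_per_hour(tokens_per_second) * 1_000_000


# Hourly prices and speeds quoted in the summary above.
tiers = {
    "100 tokens/s": (1.00, 100),
    "50 tokens/s": (0.30, 50),
}

for label, (price, speed) in tiers.items():
    hourly = tokens_per_hour(speed)
    per_million = cost_per_million_tokens(price, speed)
    print(f"{label}: {hourly:,.0f} tokens/hour, about ${per_million:.2f} per million tokens")

# 100 tokens/s: 360,000 tokens/hour, about $2.78 per million tokens
# 50 tokens/s: 180,000 tokens/hour, about $1.67 per million tokens
```

Under these assumptions, the slower tier delivers half the throughput at 30% of the hourly price, so its implied per-token cost is 40% lower.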