Hasty Briefs (beta)

Kimi K2.6 just beat Claude, GPT-5.5, and Gemini in a coding challenge

5 hours ago
  • #AI Programming Contest
  • #Chinese AI
  • #Open-Weights Models
  • Kimi K2.6, an open-weights model from the Chinese lab Moonshot AI, won the Word Gem Puzzle programming challenge with 22 match points, finishing ahead of Western models such as GPT-5.5 and Claude Opus 4.7.
  • The Word Gem Puzzle involves sliding tiles on grids to form valid English words, with scoring that rewards longer words and penalizes short ones.
  • Kimi K2.6 slid tiles aggressively to unlock high-value words, which paid off on larger scrambled grids (e.g., 30x30); other models, such as MiMo V2-Pro, instead scanned statically for seed words that were still intact.
  • MiMo V2-Pro placed second with 20 match points, and the results show that the better strategy (sliding vs. scanning) depends on grid conditions.
  • The result suggests that open-weights models are closing the capability gap with frontier labs: Kimi K2.6 scores close to top Western models on benchmarks, pointing to a shift in the competitive landscape.
  • Notable underperformers included Muse Spark, which scored -15,309 because it claimed every word regardless of the penalties on short ones, and DeepSeek V4, which produced malformed output.
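
To see why claiming every word can sink a score when short words are penalized, here is a minimal Python sketch. The summary does not give the actual scoring formula, so the rule below (word length minus a break-even of 4 letters) is purely an assumption for illustration, as are the function names and the sample word list.

```python
# Hypothetical scoring rule, NOT the challenge's real formula:
# a word earns (length - break_even) points, so short words cost points.

def word_score(word: str, break_even: int = 4) -> int:
    """Score one word under the assumed length-based rule."""
    return len(word) - break_even


def total_score(claimed: list[str], break_even: int = 4) -> int:
    """Sum the assumed scores of every claimed word."""
    return sum(word_score(w, break_even) for w in claimed)


def select_profitable(candidates: list[str], break_even: int = 4) -> list[str]:
    """Keep only the words that score above zero under the assumed rule."""
    return [w for w in candidates if word_score(w, break_even) > 0]


# Illustrative words a solver might find on a grid (made-up sample).
found = ["at", "to", "in", "cat", "gem", "slide", "puzzle", "scramble"]

claim_all = total_score(found)                          # claim everything
claim_filtered = total_score(select_profitable(found))  # skip penalized words

print(claim_all, claim_filtered)
```

Under this assumed rule, the five short words outweigh the three long ones, so claiming everything ends up negative while filtering first stays positive. On a large grid with thousands of short words, the same effect could plausibly produce a deeply negative total like the one reported for Muse Spark.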