The gap between open-weights LLMs and closed-source LLMs
4 hours ago
- #AI Benchmarks
- #Performance Gap
- #Open-Source LLMs
- The plot illustrates the performance gap between open-source and closed-source large language models (LLMs) on a benchmark.
- The gap is measured as a time lag: how long ago closed-source models reached the performance level that open-source models achieve today.
- The benchmark used is the Artificial Analysis Intelligence Index, which correlates with perceived model capabilities.
- Initial analysis suggests the gap will shrink to zero by December 3, 2026, hinting at convergence.
- Extended analysis across 18 benchmarks shows most gaps remain steady at about 5 months, not shrinking significantly.
- Significant improvement is observed only in coding benchmarks, where the lag fell from 15 months to 1-2 months.
- Because different benchmarks lead to different predictions, the results highlight how hard it is to measure LLM quality.
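
The lag measurement and the convergence date described above can be sketched roughly as follows. This is a minimal illustration, not the article's actual method: the function names `lag_in_days` and `zero_crossing` are hypothetical, the scores are invented, and a straight-line fit is assumed as the extrapolation, which the summary does not confirm.

```python
from datetime import date


def lag_in_days(open_date, open_score, closed_frontier):
    """Days between open_date and the first date the closed-source
    frontier reached open_score.

    closed_frontier: list of (date, best_score_so_far), sorted by date.
    Returns None if no closed model had reached that score by open_date.
    """
    for closed_date, closed_score in closed_frontier:
        if closed_date > open_date:
            break
        if closed_score >= open_score:
            return (open_date - closed_date).days
    return None


def zero_crossing(xs, ys):
    """Fit a least-squares line through (xs, ys) and return the x at
    which the line reaches y = 0, or None if the line is not falling."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    if slope >= 0:
        return None  # the gap is steady or growing: no convergence predicted
    intercept = mean_y - slope * mean_x
    return -intercept / slope


# Invented numbers, for illustration only.
closed_frontier = [
    (date(2024, 1, 1), 40),
    (date(2024, 7, 1), 50),
    (date(2025, 1, 1), 60),
]
print(lag_in_days(date(2025, 1, 1), 50, closed_frontier))  # 184
print(zero_crossing([0, 1, 2], [10, 8, 6]))                # 5.0
print(zero_crossing([0, 1, 2], [5, 5, 5]))                 # None
```

Under these assumptions, a shrinking lag yields a finite convergence date, while a steady lag, like the roughly 5 months reported for most of the 18 benchmarks, yields none. That is one way the same procedure can produce very different predictions depending on the benchmark.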