The gap between open-weight LLMs and closed-source LLMs

4 hours ago

The plot illustrates the performance gap between open-source and closed-source large language models (LLMs) on a benchmark.
The gap is measured in time: how long ago closed-source models reached the performance level that open-source models have today.
The benchmark used is the Artificial Analysis Intelligence Index, which correlates with perceived model capabilities.
Initial analysis suggests the gap will shrink to zero by December 3, 2026, hinting at convergence.
An extended analysis across 18 benchmarks shows that most gaps hold steady at about 5 months rather than shrinking significantly.
Significant improvement appears only in coding benchmarks, where open-source models went from 15 months behind to 1-2 months behind.
The differing results highlight how hard it is to measure LLM quality, since different benchmarks lead to different predictions.
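
The time-based gap and the "shrinks to zero" projection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the original analysis: the function names (`closed_frontier_date`, `gap_in_months`, `zero_crossing`) are hypothetical, the scores in the usage example are made up, and the straight-line least-squares fit is an assumed extrapolation method that the summary does not confirm.

```python
from datetime import date

DAYS_PER_MONTH = 30.44  # average month length, used only for unit conversion


def closed_frontier_date(closed_history, score):
    """Earliest date at which the best closed-source score reached `score`.

    `closed_history` is a list of (date, score) pairs. Returns None if the
    closed-source frontier never reached that score.
    """
    best = float("-inf")
    for day, value in sorted(closed_history):
        best = max(best, value)  # running best score so far
        if best >= score:
            return day
    return None


def gap_in_months(closed_history, open_score, today):
    """How many months ago closed-source models matched today's open score."""
    reached = closed_frontier_date(closed_history, open_score)
    if reached is None:
        return None
    return (today - reached).days / DAYS_PER_MONTH


def zero_crossing(xs, ys):
    """Fit a least-squares line to (xs, ys) and return the x where it hits 0.

    Returns None when the fitted gap is not shrinking (slope >= 0).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        return None
    return -intercept / slope


if __name__ == "__main__":
    # Made-up scores, purely illustrative.
    closed = [(date(2024, 1, 1), 40), (date(2024, 7, 1), 50), (date(2025, 1, 1), 60)]
    print(gap_in_months(closed, open_score=50, today=date(2025, 1, 1)))
    # Gaps (in months) observed at months 0, 1, 2; the fit projects closure.
    print(zero_crossing([0, 1, 2], [10, 8, 6]))
```

Running the same two functions per benchmark is one way to see the effect the summary describes: a flat series of gaps has no zero crossing, while a steadily shrinking one projects a convergence date.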
