French Government Created LLM Leaderboard 'Rigged' for Mistral
14 days ago
- #energy-efficiency
- #AI-ranking
- #Bradley-Terry
- The ranking system compar:IA is based on user votes and reactions collected since October 2024.
- The ranking uses the Bradley-Terry model to calculate satisfaction scores, reflecting user preferences rather than technical performance.
- Energy consumption for models is estimated using the Ecologits methodology, considering model size and architecture.
- Proprietary models are excluded from energy consumption graphs due to lack of transparency in their data.
- The Bradley-Terry ranking method is more robust than simple win rates, accounting for opponent difficulty and uncertainty.
- Different model architectures (MoE, Dense, MatFormer) affect both energy efficiency and performance.
- Energy-efficient models are highlighted in the top-left of the satisfaction vs. energy consumption graph.
- The Ecologits methodology follows the ISO 14044 standard and focuses on the impact of inference and of GPU manufacturing.
- Simple win rate rankings can be biased by low match counts and do not account for opponent strength.
- The Bradley-Terry model provides a probabilistic and fair ranking: it estimates the likely outcome of a match even between two models that were never directly compared.
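
To make the Bradley-Terry points above concrete, here is a minimal sketch in Python. It is not compar:IA's actual implementation. It assumes the standard form of the model, in which each model i has a positive strength p_i and the probability that i beats j is p_i / (p_i + p_j), and it fits the strengths with the classic iterative (minorization-maximization) update. The function names `fit_bradley_terry` and `win_probability`, the `prior` smoothing parameter, the model names, and the vote data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def fit_bradley_terry(matches, iters=500, prior=0.1):
    """Fit Bradley-Terry strengths from a list of (winner, loser) pairs.

    `prior` is an illustrative assumption: every pair of models receives
    `prior` virtual wins in each direction, which keeps all strengths
    positive and the comparison graph connected.
    """
    models = sorted({m for pair in matches for m in pair})
    wins = defaultdict(float)   # wins[i]: total wins of model i
    n = defaultdict(float)      # n[(i, j)]: number of matches between i and j

    for winner, loser in matches:
        wins[winner] += 1.0
        n[(winner, loser)] += 1.0
        n[(loser, winner)] += 1.0

    # Light smoothing: virtual matches between every pair of models.
    for a, b in combinations(models, 2):
        wins[a] += prior
        wins[b] += prior
        n[(a, b)] += 2 * prior
        n[(b, a)] += 2 * prior

    # Start from equal strengths, then iterate the MM update:
    # p_i <- W_i / sum_j n_ij / (p_i + p_j), followed by normalisation.
    p = {m: 1.0 / len(models) for m in models}
    for _ in range(iters):
        new_p = {}
        for i in models:
            denom = sum(n[(i, j)] / (p[i] + p[j]) for j in models if j != i)
            new_p[i] = wins[i] / denom
        total = sum(new_p.values())
        p = {m: v / total for m, v in new_p.items()}
    return p


def win_probability(p, a, b):
    """Probability that model `a` beats model `b` under the fitted strengths."""
    return p[a] / (p[a] + p[b])


# Made-up votes: A beats B 8 times out of 10, B beats C 8 times out of 10.
# A and C are never compared directly.
matches = (
    [("A", "B")] * 8 + [("B", "A")] * 2
    + [("B", "C")] * 8 + [("C", "B")] * 2
)

strengths = fit_bradley_terry(matches)
ranking = sorted(strengths, key=strengths.get, reverse=True)
print(ranking)
print(round(win_probability(strengths, "A", "C"), 3))
```

Although A and C never meet in this toy data, the fitted strengths still give an estimate for that matchup, because both models are linked through B. A plain win rate cannot do this: it would score A and B only on their own matches, with no adjustment for how strong the opponents were.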