French Government Created LLM Leaderboard 'Rigged' for Mistral

14 days ago
  • #energy-efficiency
  • #AI-ranking
  • #Bradley-Terry
  • The compar:IA ranking is built from user votes and reactions collected since October 2024.
  • The ranking uses the Bradley-Terry model to calculate satisfaction scores, reflecting user preferences rather than technical performance.
  • Energy consumption for models is estimated using the Ecologits methodology, considering model size and architecture.
  • Proprietary models are excluded from energy consumption graphs due to lack of transparency in their data.
  • The Bradley-Terry ranking method is more robust than simple win rates because it accounts for opponent strength and statistical uncertainty.
  • Different model architectures (MoE, dense, MatFormer) affect energy efficiency and performance.
  • Energy-efficient models are highlighted in the top-left of the satisfaction vs. energy consumption graph.
  • The Ecologits methodology follows the life-cycle assessment principles of ISO 14044, focusing on the impact of inference and of GPU manufacturing.
  • Rankings based on simple win rates can be skewed when a model has played few matches, and they ignore how strong each opponent was.
  • The Bradley-Terry model gives a probabilistic, fairer ranking: it can estimate the likely outcome of a match-up even between models that were never directly compared.
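
The Bradley-Terry points above can be illustrated with a small sketch. In the standard form of the model, each model i has a positive strength p_i, and the probability that i beats j is p_i / (p_i + p_j). The code below fits those strengths with the classic iterative (minorization-maximization) update. It is not compar:IA's actual implementation. The function names, the toy vote data, and the assumption that every model has at least one win and one loss are all illustrative, and the uncertainty estimates mentioned above are left out.

```python
from collections import defaultdict


def fit_bradley_terry(matches, n_iter=1000, tol=1e-12):
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Illustrative sketch only: assumes every model has at least one win
    and one loss, otherwise the estimates degenerate to 0 or diverge.
    """
    wins = defaultdict(float)         # total wins per model
    pair_counts = defaultdict(float)  # matches played per unordered pair
    models = set()
    for winner, loser in matches:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        models.update((winner, loser))
    models = sorted(models)

    strength = {m: 1.0 for m in models}
    for _ in range(n_iter):
        updated = {}
        for i in models:
            # Sum over opponents j of n_ij / (p_i + p_j)
            denom = sum(
                pair_counts[frozenset((i, j))] / (strength[i] + strength[j])
                for j in models
                if j != i and frozenset((i, j)) in pair_counts
            )
            updated[i] = wins[i] / denom if denom else strength[i]
        # Strengths are only defined up to scale, so fix the mean at 1.
        scale = len(models) / sum(updated.values())
        updated = {m: s * scale for m, s in updated.items()}
        delta = max(abs(updated[m] - strength[m]) for m in models)
        strength = updated
        if delta < tol:
            break
    return strength


def win_probability(strength, a, b):
    """Estimated probability that model a is preferred over model b."""
    return strength[a] / (strength[a] + strength[b])


# Hypothetical votes: A and C are never compared directly.
votes = [("A", "B")] * 3 + [("B", "A")] + [("B", "C")] * 3 + [("C", "B")]
strengths = fit_bradley_terry(votes)
print(f"P(A preferred over C) ~ {win_probability(strengths, 'A', 'C'):.3f}")
```

In this toy data A and C never meet, yet the fitted strengths still give an estimate for that pairing (about 0.9 here), because both models are linked through their results against B. A plain win-rate table cannot supply such an estimate.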