AI Energy Score v2: Refreshed Leaderboard, Now with Reasoning

5 days ago

AI Energy Score v2 launches with a refreshed leaderboard, new models, and a new reasoning benchmark task.
The benchmarking code and submission process have been improved to streamline evaluation.
Reasoning models use 30 times more energy on average than non-reasoning models.
For specific models, reasoning raises energy use by a factor of 150 to 700.
Reasoning models generate 300 to 800 times more tokens, which drives the higher energy use.
Energy use of reasoning models is less predictable than that of standard LLMs.
Newer models show mixed efficiency results, with some using more energy than older models.
Salesforce integrates AI Energy Score into internal benchmarking, promoting energy transparency.
Future plans include adding video generation and agentic tasks to the benchmark.
Community support is crucial for the future of AI Energy Score.
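
The link between token count and energy use noted above can be made concrete with a little arithmetic. The following is a minimal sketch, not the benchmark's own code: the Run class, its field names, and all numbers are hypothetical and chosen only so the ratios are easy to follow. It compares one model with reasoning off and on, and reports the energy ratio, the token ratio, and the energy per generated token.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One hypothetical benchmark run (illustrative fields, not the leaderboard schema)."""
    energy_wh: float     # total energy measured for the run, in watt-hours
    output_tokens: int   # total tokens generated across all prompts


def reasoning_overhead(off: Run, on: Run) -> dict:
    """Compare the same model with reasoning off versus on."""
    return {
        "energy_ratio": on.energy_wh / off.energy_wh,
        "token_ratio": on.output_tokens / off.output_tokens,
        "wh_per_token_off": off.energy_wh / off.output_tokens,
        "wh_per_token_on": on.energy_wh / on.output_tokens,
    }


# Made-up measurements, not taken from the leaderboard.
baseline = Run(energy_wh=2.0, output_tokens=1_000)
reasoning = Run(energy_wh=400.0, output_tokens=400_000)

result = reasoning_overhead(baseline, reasoning)
for name, value in result.items():
    print(f"{name}: {value}")
```

In this invented case the energy ratio is smaller than the token ratio, so energy per token goes down even as total energy goes up. Whether real models behave this way depends on the measured data.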