ARC-AGI-3 benchmark is out now
6 hours ago
- #AI agent
- #performance metrics
- #game completion
- Dataset: ARC-AGI-3 Public Demo
- Number of human actions needed to complete the game
- Total number of levels is listed
- Model Performance comparison in a sortable table
- Cumulative actions by level can be viewed
- An "All Providers" filter option is available
- Table includes Model, Score, Actions, Replay, and Published columns
- Humans are listed with a 100% score and no action count
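
The "cumulative actions by level" view in the list above can be read as a running total of the actions taken up to each level. A minimal sketch of that calculation follows. The function name `cumulative_actions` and all of the numbers are hypothetical, invented for illustration; they are not taken from the ARC-AGI-3 leaderboard or any ARC-AGI-3 API.

```python
from itertools import accumulate


def cumulative_actions(actions_per_level):
    """Return the running total of actions after each level."""
    return list(accumulate(actions_per_level))


# Hypothetical per-level action counts, invented for illustration only.
agent_actions = [12, 30, 45]
human_actions = [10, 20, 25]

agent_cumulative = cumulative_actions(agent_actions)
human_cumulative = cumulative_actions(human_actions)

# Print the two running totals side by side, one line per level.
for level, (agent, human) in enumerate(
    zip(agent_cumulative, human_cumulative), start=1
):
    print(f"Level {level}: agent {agent} actions, human {human} actions")
```

Comparing the two running totals level by level is one way to see where an agent starts to need more actions than a human reference, assuming per-level counts are available for both.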