The latest AI scaling graph – and why it hardly makes sense
a year ago
- #AI
- #Critique
- #Scaling
- METR published a study on AI performance in software-related tasks, leading to a viral graph.
- The graph's y-axis measures AI performance by the time humans need to solve the same tasks, a metric criticized as arbitrary and flawed.
- METR's technical report was careful, but social media posts exaggerated the findings beyond the study's scope.
- The dataset of software tasks was well-constructed but may not generalize to other cognitive domains.
- Extrapolating future AI capabilities from the graph is misguided; the assumption of continued exponential growth is unreliable.
- Confirmation bias and hype around such graphs are more prevalent among investors than among builders in the AI field.