My Participation in the METR AI Productivity Study
10 months ago
- #Developer Experience
- #Open Source
- #AI Productivity
- The METR study found that developers using AI took 19% longer to complete tasks (N=246 tasks; 95% CI [-40%, -2%], expressed as a speedup, so negative values mean slower).
- The study was a randomized controlled trial in which developers worked on tasks either with or without AI assistance.
- The author participated in the study, working on the jsdom project, which has over 1 million lines of code.
- Tasks included bug fixes, feature implementations, and test coverage improvements, with 9 AI-allowed and 10 no-AI tasks.
- The AI tools used included Cursor’s agent mode, Claude Code, and Gemini; all of them had trouble staying consistent with the codebase and implementing specifications.
- AI models struggled with existing codebase styles, repetitive tasks, and accurately implementing web specifications.
- Although the AI-assisted tasks felt engaging, they were not faster, because of frequent missteps and the need for constant oversight.
- The author suggests that running agents in parallel is a more promising approach for future AI-assisted productivity.
- Large, established codebases pose unique challenges for AI tools compared to greenfield projects.
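
To make the "took 19% longer" style of result concrete, here is a minimal sketch of a randomized with/without-AI comparison. It is an illustration under stated assumptions, not the study's actual analysis: the function names `assign_conditions` and `percent_change_in_time` are hypothetical, the ratio of geometric means is a simplified stand-in for whatever estimator the study used, and the hours are made-up numbers.

```python
import random
from statistics import geometric_mean


def assign_conditions(task_ids, seed=0):
    """Randomly label each task 'ai' or 'no_ai' (one coin flip per task)."""
    rng = random.Random(seed)
    return {task: rng.choice(["ai", "no_ai"]) for task in task_ids}


def percent_change_in_time(ai_hours, no_ai_hours):
    """Percent change in typical completion time with AI versus without.

    Positive values mean the AI-allowed tasks took longer.
    Uses a ratio of geometric means (an assumption made for this sketch).
    """
    ratio = geometric_mean(ai_hours) / geometric_mean(no_ai_hours)
    return (ratio - 1.0) * 100.0


# Hypothetical completion times in hours, for illustration only
# (these are not data from the study).
ai_hours = [1.2, 2.4, 0.6]
no_ai_hours = [1.0, 2.0, 0.5]

conditions = assign_conditions(range(19), seed=1)
change = percent_change_in_time(ai_hours, no_ai_hours)
print(f"AI-allowed tasks: {change:+.1f}% time relative to no-AI tasks")
```

Under this sign convention a positive change is a slowdown. A result reported as a speedup flips the sign, which is why a "took longer" headline can sit next to a negative confidence interval.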