We Are Changing Our Developer Productivity Experiment Design
6 hours ago
- #AI-productivity
- #developer-studies
- #selection-bias
- METR's initial study (Feb-Jun 2025) found that AI tools caused a 19% slowdown in task completion for open-source developers.
- A follow-up study (Aug 2025) with 57 developers faced reliability issues due to selection bias (developers refusing to work without AI) and pay rate reduction ($150/hr → $50/hr).
- Raw data suggests a shift from a 19% slowdown (early 2025) to an 18% speedup for returning developers, but the confidence interval for the newer estimate still includes no effect.
- Key challenges: developers avoiding AI-disallowed tasks (30-50% admitted to doing so), difficulty tracking time with agentic tools, and task/quality differences between conditions.
- Quotes reveal strong developer preference for AI (e.g., 'like taking an Uber vs walking').
- Proposed alternative research methods: Intensive short experiments, observational GitHub data, fixed-task designs, and developer-level randomization.
- Current study design likely underestimates true AI productivity gains due to missing data from high-adopters and AI-optimized tasks.
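
The selection-bias point above can be illustrated with a toy simulation. This is a minimal sketch under invented assumptions, not METR's analysis: the gain distribution (AI saves 0-60% of task time), the linear dropout rule, and the function name `simulate` are all hypothetical. It only shows the direction of the bias: if the developers who benefit most from AI are the most likely to leave the sample, the average gain measured on the remaining developers falls below the true average.

```python
import random
import statistics

def simulate(n_devs=10_000, seed=0):
    """Toy model: developers who gain most from AI are most likely to opt out."""
    rng = random.Random(seed)
    true_gains = []      # fractional time saving from AI, for every developer
    observed_gains = []  # same quantity, but only for developers who stay
    for _ in range(n_devs):
        # Hypothetical assumption: AI saves between 0% and 60% of task time.
        gain = rng.uniform(0.0, 0.6)
        true_gains.append(gain)
        # Hypothetical dropout rule: the chance of refusing AI-disallowed work
        # rises linearly with the developer's gain, up to 80%.
        p_drop = (gain / 0.6) * 0.8
        if rng.random() >= p_drop:
            observed_gains.append(gain)
    return statistics.mean(true_gains), statistics.mean(observed_gains)

true_mean, observed_mean = simulate()
print(f"true mean gain:     {true_mean:.3f}")
print(f"observed mean gain: {observed_mean:.3f}")
```

The size of the gap depends entirely on the made-up dropout rule, so the printed numbers say nothing about the real magnitude of the underestimate, only that selective attrition of high-adopters pushes the estimate downward.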