We Are Changing Our Developer Productivity Experiment Design

METR's initial study (Feb-Jun 2025) found that AI tools caused a roughly 20% slowdown in task completion for open-source developers.
A follow-up study (Aug 2025) with 57 developers faced reliability issues due to selection bias (developers refusing to work without AI) and pay rate reduction ($150/hr → $50/hr).
Raw data suggests a shift from a 19% slowdown (early 2025) to an 18% speedup for returning developers, but the confidence intervals include no effect.
Key challenges: developers avoiding tasks where AI would be disallowed (30-50% admitted doing so), difficulty tracking time with agentic tools, and task/quality differences between conditions.
Quotes reveal strong developer preference for AI (e.g., 'like taking an Uber vs walking').
Proposed alternative research methods: Intensive short experiments, observational GitHub data, fixed-task designs, and developer-level randomization.
Current study design likely underestimates true AI productivity gains due to missing data from high-adopters and AI-optimized tasks.
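
To make the underestimation argument concrete, here is a toy sketch in Python. All numbers and names (`tasks`, `time_change`, `submitted`, `max_tolerated_gain`) are hypothetical and are not taken from METR's data or analysis. It also assumes, unrealistically, that both the with-AI and without-AI time are known for every task; a real randomized study observes only one of the two per task. The point is only that if developers withhold the tasks where AI helps most, the measured effect shrinks toward zero.

```python
from statistics import mean

def time_change(tasks):
    """Mean relative change in completion time with AI (negative = faster)."""
    return mean(t["ai_hours"] / t["no_ai_hours"] - 1 for t in tasks)

def submitted(tasks, max_tolerated_gain=0.3):
    """Keep only tasks a developer would still enter into the study.

    Hypothetical rule: a task is withheld when AI would save more than
    `max_tolerated_gain` of the time, because the developer does not want
    to risk having AI disallowed on it.
    """
    return [
        t for t in tasks
        if 1 - t["ai_hours"] / t["no_ai_hours"] <= max_tolerated_gain
    ]

# Made-up tasks: hours without AI and hours with AI for the same work.
tasks = [
    {"no_ai_hours": 2.0, "ai_hours": 2.2},  # AI is slightly slower
    {"no_ai_hours": 2.0, "ai_hours": 2.0},  # no difference
    {"no_ai_hours": 2.0, "ai_hours": 1.6},  # modest gain
    {"no_ai_hours": 2.0, "ai_hours": 1.0},  # large gain, withheld
    {"no_ai_hours": 2.0, "ai_hours": 0.6},  # very large gain, withheld
]

true_effect = time_change(tasks)                 # effect over all tasks
measured_effect = time_change(submitted(tasks))  # effect over submitted tasks only

print(f"all tasks:       {true_effect:+.1%}")
print(f"submitted tasks: {measured_effect:+.1%}")
```

In this made-up example the change in completion time over all five tasks is about -26%, while the three tasks that survive the filter show only about -3%, so the filtered sample understates the gain.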
