Evaluating Agent-Based Program Repair at Google
a year ago
- #software-engineering
- #LLMs
- #program-repair
- Agent-based program repair uses LLMs to automatically fix complex bugs by combining planning, tool use, and code generation.
- The paper evaluates agent-based repair in an enterprise context using 178 bugs from Google's issue tracking system (78 human-reported, 100 machine-reported).
- Passerine, an agent similar to SWE-Agent, produces plausible patches (patches that pass the bug's tests) for 73% of machine-reported bugs and 25.6% of human-reported bugs when using Gemini 1.5 Pro.
- Manual examination shows that 43% of machine-reported and 17.9% of human-reported bugs have a patch that is semantically equivalent to the ground-truth patch.
- The study highlights differences in bug distribution (language diversity, size, spread of changes) between Google's dataset and the open-source SWE-Bench.
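
To make the percentages above easier to compare, the short Python sketch below converts them back into approximate bug counts and a combined rate over all 178 bugs. The counts are inferred here by rounding the quoted percentages, and the combined rates are derived arithmetic, so neither should be read as figures reported by the paper. The names `BUGS`, `implied_count`, and `summarize` are illustrative only.

```python
# Figures quoted in the summary above. Counts are back-computed by rounding,
# which is an assumption of this sketch, not something the summary reports.
BUGS = {
    "machine-reported": {"total": 100, "plausible_pct": 73.0, "equivalent_pct": 43.0},
    "human-reported": {"total": 78, "plausible_pct": 25.6, "equivalent_pct": 17.9},
}


def implied_count(total: int, pct: float) -> int:
    """Nearest whole number of bugs consistent with a quoted percentage."""
    return round(total * pct / 100)


def summarize(bugs: dict) -> dict:
    """Sum the implied counts across bug sources and derive combined rates."""
    total = sum(b["total"] for b in bugs.values())
    plausible = sum(implied_count(b["total"], b["plausible_pct"]) for b in bugs.values())
    equivalent = sum(implied_count(b["total"], b["equivalent_pct"]) for b in bugs.values())
    return {
        "total": total,
        "plausible": plausible,
        "equivalent": equivalent,
        "plausible_rate": 100 * plausible / total,
        "equivalent_rate": 100 * equivalent / total,
    }


if __name__ == "__main__":
    for source, b in BUGS.items():
        n_plausible = implied_count(b["total"], b["plausible_pct"])
        n_equivalent = implied_count(b["total"], b["equivalent_pct"])
        print(f"{source}: ~{n_plausible} plausible, ~{n_equivalent} equivalent of {b['total']}")
    overall = summarize(BUGS)
    print(f"combined plausible rate: {overall['plausible_rate']:.1f}%")
    print(f"combined equivalent rate: {overall['equivalent_rate']:.1f}%")
```

Under the rounding assumption, roughly 20 of the 78 human-reported bugs get a plausible patch and roughly 14 get a semantically equivalent one. The gap between the plausible and equivalent counts in each group shows how many test-passing patches do not match the ground-truth fix.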