Evaluating Agent-Based Program Repair at Google

a year ago
  • #software-engineering
  • #LLMs
  • #program-repair
  • Agent-based program repair uses LLMs to automatically fix complex bugs by combining planning, tool use, and code generation.
  • The paper evaluates agent-based repair in an enterprise context using 178 bugs from Google's issue tracking system (78 human-reported, 100 machine-reported).
  • Passerine, an agent similar to SWE-Agent, achieves a 73% plausible patch rate for machine-reported bugs and 25.6% for human-reported bugs using Gemini 1.5 Pro.
  • Manual examination shows that 43% of machine-reported and 17.9% of human-reported bugs have at least one patch that is semantically equivalent to the ground-truth fix.
  • The study highlights differences in bug distribution (language diversity, size, spread of changes) between Google's dataset and the open-source SWE-Bench.
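
To make the percentages above concrete, the following minimal sketch converts them back into approximate bug counts using the dataset sizes quoted in the summary (100 machine-reported, 78 human-reported). The counts and the pooled rates over all 178 bugs are derived here by arithmetic and are not figures quoted in the summary; the variable and function names are hypothetical.

```python
# Dataset sizes and rates as quoted in the summary above.
BUGS = {"machine": 100, "human": 78}
PLAUSIBLE_PCT = {"machine": 73.0, "human": 25.6}
EQUIVALENT_PCT = {"machine": 43.0, "human": 17.9}


def implied_count(pct: float, total: int) -> int:
    """Nearest whole number of bugs implied by a rounded percentage."""
    return round(pct * total / 100)


# Assumption: each quoted percentage is a rounded ratio over its own bug set.
plausible = {k: implied_count(PLAUSIBLE_PCT[k], BUGS[k]) for k in BUGS}
equivalent = {k: implied_count(EQUIVALENT_PCT[k], BUGS[k]) for k in BUGS}

# Pooled rates over the whole dataset (derived, not reported in the summary).
total = sum(BUGS.values())
pooled_plausible = 100 * sum(plausible.values()) / total
pooled_equivalent = 100 * sum(equivalent.values()) / total

for kind in BUGS:
    print(f"{kind}: {plausible[kind]}/{BUGS[kind]} plausible, "
          f"{equivalent[kind]}/{BUGS[kind]} semantically equivalent")
print(f"pooled: {pooled_plausible:.1f}% plausible, "
      f"{pooled_equivalent:.1f}% semantically equivalent")
```

Under these assumptions, the human-reported figures correspond to roughly 20 and 14 of the 78 bugs, which shows how much smaller the absolute numbers are for that set than the percentages alone suggest.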