PRs and LLMs

13 hours ago

Senior engineers spend too much time reviewing PRs from coding agents, which cuts into their own coding time.
Experienced engineers perceive AI-generated code as not good enough and do not trust it.
There is concern about merging large changes into products faster than the team can absorb them.
Feldera shifts the burden to PR authors by requiring unit tests, 99% coverage, and validation tests that must fail.
Substantial PRs also get autonomous end-to-end testing, with a summary in the PR description.
Invest in executable specifications for APIs to constrain agent-written code and enable property-based testing.
Make executable specifications and implementations interchangeable for easier bug isolation.
Evaluate generated code based on functionality, data structures, API design, clean code principles, and performance.
Senior engineers use institutional memory to slow changes in risky code areas.
Complicated, risky code is a liability because agents may perform worse on it than humans do, which is a reason to refactor it.
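
The points about executable specifications, property-based testing, and interchangeable specs and implementations can be illustrated with a small sketch. Everything in it is hypothetical and not taken from Feldera's codebase: the `top_k` API, the function names, and the hand-rolled randomized check, which stands in for a real property-based testing library. The idea is that the specification is an obviously correct function with the same signature as the optimized implementation, so either one can be checked against the other or swapped into the caller.

```python
import heapq
import random
from typing import Callable, List, Optional, Tuple

# Shared signature: any function of this type can be plugged in (hypothetical API).
TopK = Callable[[List[int], int], List[int]]


def spec_top_k(items: List[int], k: int) -> List[int]:
    """Executable specification: obviously correct, not tuned for speed."""
    return sorted(items, reverse=True)[:k]


def fast_top_k(items: List[int], k: int) -> List[int]:
    """Optimized implementation, the kind of code an agent might write."""
    return heapq.nlargest(k, items)


def buggy_top_k(items: List[int], k: int) -> List[int]:
    """Deliberately wrong implementation: it silently drops duplicates."""
    return sorted(set(items), reverse=True)[:k]


def find_counterexample(
    impl: TopK, spec: TopK, trials: int = 500, seed: int = 0
) -> Optional[Tuple[List[int], int]]:
    """Compare impl against spec on random inputs; return the first mismatch."""
    rng = random.Random(seed)
    for _ in range(trials):
        items = [rng.randint(-5, 5) for _ in range(rng.randint(0, 8))]
        k = rng.randint(0, 10)
        if impl(items, k) != spec(items, k):
            return items, k
    return None


def leaderboard(scores: List[int], top_k: TopK = fast_top_k) -> List[int]:
    """Caller that depends only on the signature, so the spec can be swapped in."""
    return top_k(scores, 3)
```

If `leaderboard` misbehaves, calling it with `top_k=spec_top_k` shows whether the fault lies in the optimized implementation or elsewhere, and `find_counterexample(buggy_top_k, spec_top_k)` returns a concrete failing input that can go into the PR as a regression test.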
