AI made every test pass, but the code was still wrong
7 days ago
- #AI Testing
- #Solidity
- #Software Validation
- Doodledapp converts visual node graphs into Solidity smart contracts.
- The team tested 17 real-world contracts using roundtrip testing to validate the converter.
- The AI-generated tests all passed on the first run, which exposed a flaw in the testing methodology rather than confirming the converter was correct.
- The AI tested the implementation, not the intent: its tests confirmed what the code did rather than whether that behavior was correct.
- Researchers identified this as the 'ground truth problem': the AI lacks an independent source of truth.
- The team restructured the approach to compare contracts at the AST level for semantic correctness.
- By comparing the converter's output against the original contracts, the revised method identified bugs, which were then fixed.
- Key takeaway: to validate correctness, AI needs an independent reference point, not just the implementation under test.
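
The difference between the two testing approaches can be sketched in a few lines of Python. This is an illustration only, not Doodledapp's code. The AST is a toy dictionary rather than real Solidity compiler output, and `roundtrip`, `normalize`, and `diff` are hypothetical helpers. `roundtrip` stands in for the contract-to-graph-to-contract conversion and carries a deliberate bug. A check whose expected value is captured from the converter's own output passes, while a comparison against the original AST reports the bug.

```python
def normalize(node):
    """Drop source-location metadata ("src") so only structure is compared."""
    if isinstance(node, dict):
        return {k: normalize(v) for k, v in node.items() if k != "src"}
    if isinstance(node, list):
        return [normalize(n) for n in node]
    return node


def roundtrip(node):
    """Hypothetical stand-in for contract -> node graph -> contract.

    Deliberately buggy for the demo: every '-' operator becomes '+'.
    """
    if isinstance(node, dict):
        out = {k: roundtrip(v) for k, v in node.items()}
        if out.get("op") == "-":
            out["op"] = "+"  # the injected bug
        out["src"] = "regenerated"  # locations legitimately change
        return out
    if isinstance(node, list):
        return [roundtrip(n) for n in node]
    return node


def diff(a, b, path="root"):
    """Return the paths at which two normalized ASTs differ."""
    if type(a) is not type(b):
        return [path]
    if isinstance(a, dict):
        found = []
        for key in sorted(set(a) | set(b)):
            if key not in a or key not in b:
                found.append(f"{path}.{key}")
            else:
                found.extend(diff(a[key], b[key], f"{path}.{key}"))
        return found
    if isinstance(a, list):
        if len(a) != len(b):
            return [path]
        found = []
        for i, (x, y) in enumerate(zip(a, b)):
            found.extend(diff(x, y, f"{path}[{i}]"))
        return found
    return [] if a == b else [path]


# Toy AST for the expression `balance - amount` (assumed shape, not solc output).
original = {
    "kind": "BinaryOperation", "op": "-", "src": "10:16",
    "left": {"kind": "Identifier", "name": "balance", "src": "10:7"},
    "right": {"kind": "Identifier", "name": "amount", "src": "20:6"},
}

converted = roundtrip(original)

# Implementation-derived check: the expected value is captured from the
# converter itself, so it passes whether or not the converter is right.
snapshot = normalize(converted)
implementation_test_passes = normalize(roundtrip(original)) == snapshot

# Intent check: the original contract is the independent reference point.
mismatches = diff(normalize(original), normalize(converted))

print("implementation-derived test passes:", implementation_test_passes)
print("mismatches against original:", mismatches)
```

Normalizing before comparing matters because regenerated source has different formatting and locations even when the conversion is correct. Only structural differences, such as the changed operator here, should count as failures.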