Hasty Briefs (beta)

AI made every test pass, but the code was still wrong

7 days ago
  • #AI Testing
  • #Solidity
  • #Software Validation
  • Doodledapp converts visual node graphs into Solidity smart contracts.
  • The team tested 17 real-world contracts using roundtrip testing to validate the converter.
  • AI-generated tests passed every check on the first run, which exposed a flaw in the testing methodology rather than proving the converter correct.
  • The AI tested the implementation, not the intent: its tests confirmed what the code did, not whether that behavior was correct.
  • Researchers identified this as the 'ground truth problem': the AI lacks an independent source of truth.
  • The team restructured the approach to compare contracts at the AST level for semantic correctness.
  • Comparing the converter's output against the original contracts exposed bugs, which the team then fixed.
  • Key takeaway: AI needs an independent reference point to validate correctness; without one, it can only confirm what the implementation already does.
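
The AST-level comparison described above can be illustrated with a minimal Python sketch. This is not Doodledapp's actual harness: the summary does not say how the comparison is implemented, so the helper names (`normalize`, `first_difference`), the hand-written toy trees, and the choice to ignore the `src` and `id` fields are all assumptions made for illustration. The idea is that the original contract's AST serves as the independent reference, and the converter's roundtrip output is checked against it once position and bookkeeping data are stripped.

```python
from typing import Any, Optional

# Assumption: these keys carry only source positions and node numbering,
# so they are ignored when comparing two trees for meaning.
NON_SEMANTIC_KEYS = frozenset({"src", "id"})


def normalize(node: Any) -> Any:
    """Recursively drop non-semantic keys so trees compare by structure."""
    if isinstance(node, dict):
        return {
            key: normalize(value)
            for key, value in node.items()
            if key not in NON_SEMANTIC_KEYS
        }
    if isinstance(node, list):
        return [normalize(item) for item in node]
    return node


def first_difference(expected: Any, actual: Any, path: str = "$") -> Optional[str]:
    """Return the path of the first mismatch between two trees, or None."""
    if type(expected) is not type(actual):
        return path
    if isinstance(expected, dict):
        for key in sorted(set(expected) | set(actual)):
            if key not in expected or key not in actual:
                return f"{path}.{key}"
            diff = first_difference(expected[key], actual[key], f"{path}.{key}")
            if diff is not None:
                return diff
        return None
    if isinstance(expected, list):
        if len(expected) != len(actual):
            return f"{path}[len]"
        for index, (exp_item, act_item) in enumerate(zip(expected, actual)):
            diff = first_difference(exp_item, act_item, f"{path}[{index}]")
            if diff is not None:
                return diff
        return None
    return None if expected == actual else path


# Toy trees standing in for parsed contracts (hypothetical data).
original = {
    "nodeType": "BinaryOperation", "operator": "+", "src": "10:5:0", "id": 3,
    "leftExpression": {"nodeType": "Identifier", "name": "a", "src": "10:1:0", "id": 1},
    "rightExpression": {"nodeType": "Identifier", "name": "b", "src": "14:1:0", "id": 2},
}
# Same meaning, different positions and ids: should count as equal.
roundtrip_ok = {
    "nodeType": "BinaryOperation", "operator": "+", "src": "42:5:0", "id": 9,
    "leftExpression": {"nodeType": "Identifier", "name": "a", "src": "42:1:0", "id": 7},
    "rightExpression": {"nodeType": "Identifier", "name": "b", "src": "46:1:0", "id": 8},
}
# A converter bug that flips the operator: should be reported.
roundtrip_bad = {**roundtrip_ok, "operator": "-"}

reference = normalize(original)
ok_diff = first_difference(reference, normalize(roundtrip_ok))
bad_diff = first_difference(reference, normalize(roundtrip_bad))
```

The design point is that `reference` comes from the original contract, not from the converter under test. A test that derived its expected value from the converter's own output would accept `roundtrip_bad` as correct, which is the failure mode the summary describes.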