Hasty Briefs (beta)

"It's Hard to Eval" Is a Product Smell

9 hours ago
  • #Verification
  • #Product design
  • #AI evals
  • The statement 'our product is hard to eval' is a product smell, indicating potential usability issues for both developers and users.
  • Products designed for ease of verification should build user trust and spare users from redoing the work just to check AI outputs.
  • Example 1: AI data agents should provide checkable artifacts, like interactive notebooks with progressive disclosure, to allow verification through trusted sources, metric definitions, and sanity checks.
  • Example 2: An AI tool for physical education lesson plans should anchor generated plans in vetted, existing plans, showing diffs and reasons for changes to reduce cognitive load during review.
  • Example 3: A workers' compensation medical report tool should act as a research assistant, surfacing evidence, contradictions, and gaps for doctors to verify before assembling final reports.
  • General principles include understanding user verification needs, using provenance and progressive disclosure, and breaking workflows into smaller, verifiable units.
  • Designing for verification aligns with established principles like needfinding and sensemaking, and is crucial in the AI era where verification is a bottleneck.
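
As a rough illustration of the "anchor in a vetted plan and show diffs" idea from Example 2, here is a minimal Python sketch. It is not taken from the original article: the function name `changed_lines`, the record fields, and the sample lesson plans are all hypothetical, and it assumes a plan can be represented as a list of text lines. It uses the standard library's `difflib.SequenceMatcher` to surface only what changed, so a reviewer does not have to reread the whole plan.

```python
from difflib import SequenceMatcher


def changed_lines(vetted, generated):
    """Return one record per region where the generated plan departs from the vetted one.

    Hypothetical helper for illustration; both arguments are lists of strings.
    """
    matcher = SequenceMatcher(a=vetted, b=generated, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged text needs no review
        changes.append(
            {"kind": tag, "before": vetted[i1:i2], "after": generated[j1:j2]}
        )
    return changes


# Made-up sample data: a vetted plan and an AI-edited variant.
vetted_plan = [
    "Warm-up: 5 minutes of light jogging",
    "Main activity: relay races in teams of four",
    "Cool-down: 5 minutes of stretching",
]
generated_plan = [
    "Warm-up: 5 minutes of light jogging",
    "Main activity: relay races in teams of three",
    "Cool-down: 5 minutes of stretching",
]

changes = changed_lines(vetted_plan, generated_plan)
for change in changes:
    print(change["kind"], change["before"], "->", change["after"])
```

In a real tool, each change record could also carry the model's stated reason for the edit, so the reviewer sees the diff and its justification side by side. How those reasons are produced is outside the scope of this sketch.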