Hasty Briefs (beta)

Why LLMs still lack taste

9 hours ago
  • #LLMs
  • #Software Development
  • #AI Taste
  • LLMs are highly capable at software development but lack 'taste': the ability to choose the best option among several correct alternatives.
  • Taste is context-dependent, subjective, and crucial for long-term maintainability, but LLMs struggle with it because their training relies on verifiable rewards.
  • Humans acquire taste through years of experience in varied contexts, learning which code properties are desirable, whereas LLM training is short and focused on objective targets.
  • RLVR (Reinforcement Learning with Verifiable Rewards) improves coding ability but fails to capture long-term goals such as maintainability and uptime, because its rewards are narrow.
  • The proposed solution is a long-horizon RLVR harness that simulates a real-world SaaS environment with diverse users and uses monetary rewards to teach taste.
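
To make the contrast between a narrow verifiable reward and a long-horizon monetary reward concrete, here is a minimal sketch. It is not the article's harness: the `Episode` fields, the `narrow_reward` and `monetary_reward` functions, and the prices and hourly costs are all hypothetical, chosen only to show how two changes that both pass their tests could score very differently once uptime and maintenance effort are priced in.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Episode:
    """Hypothetical record of one simulated month of a SaaS product."""
    tests_passed: bool        # did the code pass its test suite?
    uptime_fraction: float    # share of the month the service was up (0.0 to 1.0)
    active_users: int         # simulated paying users that month
    maintenance_hours: float  # engineer hours spent on fixes and rework


def narrow_reward(ep: Episode) -> float:
    """Short-horizon verifiable reward: 1.0 if the tests pass, else 0.0."""
    return 1.0 if ep.tests_passed else 0.0


def monetary_reward(
    episodes: Iterable[Episode],
    price_per_user: float = 10.0,  # assumed monthly price, illustrative only
    hourly_cost: float = 100.0,    # assumed engineering cost, illustrative only
) -> float:
    """Long-horizon reward: simulated revenue minus maintenance cost, summed over months."""
    total = 0.0
    for ep in episodes:
        revenue = ep.active_users * price_per_user * ep.uptime_fraction
        cost = ep.maintenance_hours * hourly_cost
        total += revenue - cost
    return total


# Two hypothetical codebases: both pass their tests every month,
# but one is cheap to keep running and the other is not.
clean = [Episode(True, 1.0, 100, 2.0) for _ in range(3)]
hacky = [Episode(True, 0.75, 100, 10.0) for _ in range(3)]

# The narrow reward cannot tell the two apart.
same_narrow = [narrow_reward(e) for e in clean] == [narrow_reward(e) for e in hacky]

# The monetary reward separates them over the longer horizon.
clean_total = monetary_reward(clean)
hacky_total = monetary_reward(hacky)
```

In this toy setup the per-step test signal is identical for both codebases, while the cumulative monetary signal favors the one that is cheaper to keep running. That is the gap the summary attributes to narrow rewards.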