Binary Retrieval-Augmented Reward Mitigates Hallucinations
a day ago
- #hallucination-mitigation
- #language-models
- #reinforcement-learning
- Proposes a binary retrieval-augmented reward (RAR) method to mitigate hallucinations in language models.
- Achieves a 39.3% reduction in hallucination rates for open-ended generation.
- Enables calibrated abstention in short-form question answering, reducing incorrect answers by 44.4% on PopQA and 21.7% on GPQA.
- Preserves performance on instruction following, math, and coding tasks.
- Outperforms supervised training and continuous-reward RL baselines in factuality.
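
To make the idea of a binary retrieval-augmented reward concrete, here is a minimal sketch. It is not the authors' implementation. It assumes the reward is 1 when nothing in the retrieved evidence contradicts the response and 0 otherwise, and that an abstention, which makes no checkable claim, is therefore not penalized. The names `binary_rar`, `toy_retrieve`, `toy_contradicts`, and `TOY_CORPUS` are hypothetical; a real system would presumably replace the toy retriever with a search index and the toy checker with a model-based verifier.

```python
from typing import Callable, List, Tuple

# Toy stand-in for a retrieved passage: (topic, expected value).
Evidence = Tuple[str, str]

# Hypothetical knowledge store, for illustration only.
TOY_CORPUS: List[Evidence] = [
    ("capital of france", "paris"),
    ("boiling point of water at sea level", "100"),
]


def toy_retrieve(response: str) -> List[Evidence]:
    """Return corpus entries whose topic string appears in the response."""
    text = response.lower()
    return [doc for doc in TOY_CORPUS if doc[0] in text]


def toy_contradicts(response: str, doc: Evidence) -> bool:
    """Assumed check: the response covers the topic but omits the expected value."""
    _topic, value = doc
    return value not in response.lower()


def binary_rar(
    response: str,
    retrieve: Callable[[str], List[Evidence]] = toy_retrieve,
    contradicts: Callable[[str, Evidence], bool] = toy_contradicts,
) -> float:
    """Binary reward: 0.0 if any retrieved evidence contradicts the response, else 1.0."""
    evidence = retrieve(response)
    return 0.0 if any(contradicts(response, doc) for doc in evidence) else 1.0


if __name__ == "__main__":
    for reply in (
        "The capital of France is Paris.",
        "The capital of France is Lyon.",
        "I am not sure.",
    ):
        print(reply, binary_rar(reply))
```

Under these assumptions the correct answer and the abstention both score 1.0, while the incorrect answer scores 0.0. Because the signal is all-or-nothing rather than a graded score, a response earns no partial credit for being mostly right, which is the property that separates it from the continuous-reward baselines mentioned above.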