Reliability of LLMs as medical assistants for the general public: a randomized preregistered study - PubMed

3 months ago

A randomized, preregistered study examines how reliably LLMs serve as medical assistants for the general public.
LLMs achieve high accuracy in medical licensing exams but perform poorly in real-world scenarios with human participants.
Participants using LLMs identified relevant conditions in fewer than 34.5% of cases and the correct disposition (the appropriate course of action) in fewer than 44.2%.
The study identifies user interactions as a challenge for deploying LLMs for medical advice.
Standard benchmarks do not predict failures found in human participant testing.
The authors recommend systematic testing with human users before LLMs are deployed publicly in healthcare.
