Can AI write reports like a radiologist? A blinded evaluation of large language model-generated lumbar spine MRI reports - PubMed

3 months ago

Study compares quality of LLM-generated vs. radiologist-written lumbar spine MRI reports.
A total of 125 reports (104 radiologist-written, 21 AI-generated) were evaluated in a blinded fashion by five medical professionals.
Radiologist reports scored higher in clinical relevance, clarity, completeness, and accuracy.
No clinically false statements were found in AI-generated reports.
Evaluators varied in how accurately they identified whether a report was AI-generated or human-written, with radiologists being the most accurate.
AI-generated reports were sometimes indistinguishable from human-written ones, especially to non-specialist evaluators.
LLMs may assist radiologists in structured reporting and improve workflow efficiency.
