LLM-assisted systematic review of large language models in clinical medicine - PubMed
- #Systematic Review
- #LLM
- #Clinical Medicine
- The LLM-assisted systematic review identified 4,609 peer-reviewed studies of LLMs in clinical medicine published between January 2022 and September 2025.
- Only 1,048 studies used real-world patient data, and just 19 were prospective randomized trials.
- Most studies addressed simulated scenarios (1,857) or exam-style tasks (1,704).
- ChatGPT and related OpenAI models were evaluated in 65.7% of studies, followed by Gemini/Bard at 13.1%.
- Patient-facing communication and education accounted for 17% of tasks, followed by knowledge retrieval and education/assessment simulation.
- LLMs outperformed humans in 33% of 1,046 head-to-head comparisons, with results varying by task realism and training level.
- At least 25% of studies had a sample size below 30.
- Rigorous, patient-centered evidence remains scarce, highlighting the need for larger prospective trials before clinical adoption.
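
The counts reported above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes the three data-source categories (real-world, simulated, exam-style) are mutually exclusive, which the summary does not state explicitly but which is consistent with their sum. All variable and function names are hypothetical.

```python
# Counts as reported in the summary; nothing here is new data.
total_studies = 4609
by_data_source = {
    "real-world patient data": 1048,
    "simulated scenarios": 1857,
    "exam-style tasks": 1704,
}
randomized_trials = 19


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Share of all identified studies in each category (assumes no overlap).
shares = {name: share(n, total_studies) for name, n in by_data_source.items()}

# If the categories partition the corpus, their counts add up to the total.
categories_sum = sum(by_data_source.values())

# Prospective randomized trials as a share of all identified studies.
rct_share = share(randomized_trials, total_studies)

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"sum of categories: {categories_sum} of {total_studies}")
print(f"prospective randomized trials: {rct_share}% of all studies")
```

Under that assumption, real-world patient data accounts for roughly 23% of the identified studies, and prospective randomized trials for well under 1%, which is the imbalance the final point describes.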