Large Language Model-Driven Analysis and Report Generation of Endoscopy Videos-A Pilot Study - PubMed

3 days ago
  • #clinical validation
  • #endoscopy
  • #artificial intelligence
  • Multimodal large language models (MLLMs) were tested for generating clinically adequate esophagogastroduodenoscopy (EGD) reports.
  • The study compared clean EGD videos versus those with computer-aided detection (CAD) overlays to assess MLLM performance.
  • Five blinded endoscopists rated report adequacy in three domains: completeness, visualization, and lesion characteristics.
  • MLLM completeness was rated adequate in 56.0% of clean videos versus 48.0% of videos with CAD overlays, a difference that was not statistically significant (p = 0.500).
  • Visualization and lesion characteristics showed no significant difference between clean and overlay videos.
  • Landmark agreement accuracy was significantly higher for clean videos (0.55) than for overlay videos (0.33) (p = 0.029).
  • Gemini 2.5 Pro demonstrated inadequate performance for clinical EGD reporting, indicating a need for further optimization.
  • The study suggests larger-scale validation is required before deploying MLLMs in clinical settings.
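
The clean-versus-overlay comparisons above are paired: the same procedure is rated under both conditions. The summary does not state the number of videos or which statistical test the authors used, so the following is only a minimal sketch of one common choice for paired yes/no ratings, the exact McNemar test. The function name `mcnemar_exact` and the counts in the usage line are hypothetical and are not taken from the study.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value for paired binary ratings.

    b: pairs rated adequate under condition A only (e.g. clean video)
    c: pairs rated adequate under condition B only (e.g. CAD overlay)
    Pairs rated the same under both conditions do not affect the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Probability of a split at least this lopsided under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts for illustration only (NOT the study's data):
# two videos adequate only when clean, none adequate only with the overlay.
print(mcnemar_exact(2, 0))  # 0.5
```

With so few discordant pairs, even a consistent direction of difference cannot reach significance, which is one reason small pilot samples tend to produce large p-values for modest differences in adequacy rates.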