Comparison of the performance of large language models in answering patient questions related to cataract - PubMed

  • #ophthalmology
  • #large language models
  • #health informatics
  • Study evaluated four large language models (ChatGPT o3-mini, Gemini 2.0 pro experimental, DeepSeek Thinking R1, Kimi Thinking K1.5) for answering cataract-related patient questions in Chinese.
  • DeepSeek Thinking R1 matched Gemini 2.0 pro experimental in accuracy and outperformed ChatGPT o3-mini and Kimi Thinking K1.5.
  • DeepSeek Thinking R1 outperformed the other three models in completeness and consistency.
  • Legibility and safety were comparable among DeepSeek Thinking R1, Gemini 2.0 pro experimental, and ChatGPT o3-mini, all better than Kimi Thinking K1.5.
  • DeepSeek Thinking R1 showed the strongest overall performance in the evaluation.
  • Modern LLMs are promising for ophthalmology public education but require human oversight.