Evaluation of validity, reliability, and readability of AI chatbots for gestational diabetes mellitus: a multi-model comparative study - PubMed

a day ago
  • #gestational diabetes mellitus
  • #AI chatbots
  • #health information
  • The study evaluates the validity, reliability, and readability of six AI chatbots for gestational diabetes mellitus (GDM) information.
  • ChatGPT-5 achieved the highest accuracy (92.17%) in answering GDM-related multiple-choice questions.
  • Newer AI models consistently outperformed their predecessors across all domains of GDM knowledge.
  • ChatGPT-5 also scored highest in reliability for public-education questions but had poor transparency scores.
  • All AI models produced text above the recommended sixth-grade reading level, making them unsuitable as stand-alone patient education resources.
  • The study concludes that AI chatbots should be used as adjuncts to clinician counseling, not as primary resources.
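
To make the sixth-grade threshold concrete, here is a minimal sketch of how a chatbot answer could be scored for reading level. The summary does not say which readability index the study used, so the Flesch-Kincaid grade level formula is an assumption here. The syllable counter is a rough vowel-group heuristic, and the function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (assumption): one syllable per run of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def exceeds_sixth_grade(text: str, threshold: float = 6.0) -> bool:
    """True when the estimated grade level is above the threshold (sixth grade by default)."""
    return flesch_kincaid_grade(text) > threshold


if __name__ == "__main__":
    # Illustrative sentences only; these are not taken from the study.
    plain = "The cat sat. The dog ran."
    dense = "Gestational diabetes mellitus necessitates individualized pharmacological intervention."
    print(exceeds_sixth_grade(plain))
    print(exceeds_sixth_grade(dense))
```

Short sentences built from short words score low, while long clinical terms push the grade up quickly, which is consistent with the finding that chatbot output tends to exceed the recommended level. A production check would likely use a dictionary-based syllable counter rather than this heuristic.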