The effects of multitype prompt engineering for large language models in hypertension treatment decisions - PubMed
- #Large Language Models
- #Prompt Engineering
- #Hypertension Treatment
- Multitype prompt engineering significantly affects large language model (LLM) performance in hypertension treatment decision-making.
- A study using 300 simulated hypertension cases found that ChatGPT-4.1 with Guidance-Self-Consistency prompting achieved the highest accuracy (91.3%), approaching expert level.
- Assistance from the best-performing LLM configuration improved physician accuracy across hospital levels (e.g., from 73.4% to 82.5% at the community-hospital level) and reduced inappropriate regimen rates.
- Poor LLM configurations, such as zero-shot prompting, decreased physician performance and increased the inappropriate regimen rate from 26.6% to 35.2%.
- Effectively designed prompt strategies enable LLMs to provide reliable hypertension treatment recommendations, supporting clinical decisions.
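
The summary does not say how Guidance-Self-Consistency was implemented, so the following is only a minimal sketch of the general idea as it is commonly understood: guideline text is placed in the prompt, several answers are sampled, and the most frequent one is kept. The names `build_prompt`, `self_consistent_answer`, and the `sample` callable are hypothetical and stand in for whatever model client is actually used; the prompt wording is likewise an assumption.

```python
from collections import Counter
from typing import Callable


def build_prompt(guidance: str, case: str) -> str:
    """Prepend guideline text to the case (the assumed 'guidance' part)."""
    return (
        f"Guideline excerpt:\n{guidance}\n\n"
        f"Patient case:\n{case}\n\n"
        "Recommended regimen:"
    )


def self_consistent_answer(
    sample: Callable[[str], str], prompt: str, n_samples: int = 5
) -> str:
    """Sample the model n_samples times and return the majority answer.

    `sample` is a hypothetical function that sends the prompt to an LLM
    (with nonzero temperature) and returns one answer string.
    Answers are normalised by trimming whitespace and lowercasing, so
    only exact matches after normalisation count as the same vote.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    answers = [sample(prompt).strip().lower() for _ in range(n_samples)]
    # most_common(1) returns the answer with the highest vote count;
    # ties go to the answer that was seen first.
    return Counter(answers).most_common(1)[0][0]
```

In practice the free-text answers would need mapping to a fixed set of regimen labels before voting, since two differently worded recommendations can describe the same regimen.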