He asked AI to count carbs 27,000 times. It couldn't give the same answer twice
5 hours ago
- #Diabetes Technology
- #AI Safety
- #Healthcare Risks
- AI models show high variability in carb estimates from the same food photo, posing risks for insulin dosing in people with diabetes.
- Four models (OpenAI GPT-5.4, Anthropic Claude Sonnet 4.6, Google Gemini 2.5 Pro and 3.1 Pro) were tested with repeated queries, revealing inconsistent and sometimes dangerous estimates.
- Claude had the lowest variation (a median coefficient of variation, or CV, of 2.4%) but still showed systematic bias, while the Gemini models had high variability (a median CV of up to 11%).
- Models often misidentified foods (e.g., Bakewell tart called 'Linzer torte') and provided unreliable confidence scores with near-zero correlation to accuracy.
- The study advises against blind trust in AI carb counting, recommending multiple queries and cross-checking for safer use in diabetes apps.
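
The variability figures above are coefficients of variation (CV): the standard deviation of repeated estimates divided by their mean, expressed as a percentage. The sketch below shows one way an app might compute the CV over repeated queries for a single photo and fall back to the median estimate, in the spirit of the "multiple queries and cross-checking" advice. It is illustrative only: the function names, the sample numbers, and the 5% threshold are assumptions, not values or methods taken from the study.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(estimates):
    """Return the sample standard deviation divided by the mean, in percent."""
    if len(estimates) < 2:
        raise ValueError("need at least two estimates to measure spread")
    avg = mean(estimates)
    if avg == 0:
        raise ValueError("mean is zero, so the CV is undefined")
    return 100.0 * stdev(estimates) / avg


def summarize(estimates, max_cv_percent=5.0):
    """Summarize repeated carb estimates (grams) for one photo.

    max_cv_percent is a hypothetical cutoff chosen for illustration;
    the study does not prescribe a threshold.
    """
    cv = coefficient_of_variation(estimates)
    return {
        "median_g": median(estimates),           # robust central estimate
        "cv_percent": cv,                        # spread across repeated queries
        "needs_cross_check": cv > max_cv_percent,
    }


# Made-up example: five answers to the same photo, in grams of carbohydrate.
repeated = [42.0, 45.0, 40.0, 48.0, 44.0]
print(summarize(repeated))
```

A low CV only shows that the answers agree with each other. As the Claude result illustrates, a model can be consistent and still systematically biased, so a check like this complements rather than replaces comparison against a trusted reference.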