LLMs are still surprisingly bad at some simple tasks
5 hours ago
- #AI limitations
- #LLM critique
- #Tech skepticism
- Three commercially available LLMs (ChatGPT, Google Gemini, Claude) were asked to identify top-level domains (TLDs) whose names match HTML5 element names.
- ChatGPT provided incorrect matches and missed several valid ones.
- Google Gemini failed completely: it listed HTML elements without matching them to any TLDs.
- Claude partially succeeded but also missed many correct answers and included irrelevant ones.
- The author criticizes LLMs for their inaccuracy and for producing answers that sound plausible rather than answers that are correct.
- The post highlights the gap between AI hype and actual performance on simple, factual tasks.
- The author argues that familiarity with a subject area reveals AI's limitations more clearly.
- Comments reflect skepticism about AI's current capabilities and the hype surrounding it.
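
The task put to the three models is, in principle, a plain set intersection between two published lists, which is part of why the failures stand out. A minimal sketch of how it could be checked deterministically follows. It assumes the TLDs arrive as a newline-separated text list with `#` comment lines, in the style of the list IANA publishes. The function names and the sample data are hypothetical, hand-picked for illustration; they are neither the full lists nor the author's method.

```python
def parse_tld_list(text):
    """Parse a newline-separated TLD list, skipping blanks and '#' comments."""
    tlds = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            tlds.add(line.lower())  # normalize case before comparing
    return tlds


def matching_tlds(tlds, elements):
    """Return the element names that are also TLDs, sorted alphabetically."""
    return sorted({e.lower() for e in elements} & {t.lower() for t in tlds})


# Illustrative samples only (assumption): a real check would load the
# complete TLD list and the complete set of HTML element names.
SAMPLE_TLDS = """# sample data, not the full list
COM
ORG
BR
HR
LI
AUDIO
LINK
VIDEO
"""

SAMPLE_ELEMENTS = [
    "a", "audio", "br", "div", "hr", "li", "link", "span", "table", "video",
]

if __name__ == "__main__":
    print(matching_tlds(parse_tld_list(SAMPLE_TLDS), SAMPLE_ELEMENTS))
```

With the sample data above, the script prints `['audio', 'br', 'hr', 'li', 'link', 'video']`. Because the result depends only on the two input lists, it can be verified by hand, unlike a plausible-sounding generated answer.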