LLMs are still surprisingly bad at some simple tasks

7 hours ago

Three commercially available LLMs (ChatGPT, Google Gemini, Claude) were asked to identify top-level domains (TLDs) that match HTML5 element names.
ChatGPT provided incorrect matches and missed several valid ones.
Google Gemini failed completely: it listed HTML elements without matching TLDs.
Claude partially succeeded but also missed many correct answers and included irrelevant ones.
The author criticizes LLMs for their inaccuracy and for favoring plausible-sounding answers over correct ones.
The post highlights the gap between AI hype and actual performance on simple, factual tasks.
The author argues that familiarity with a domain reveals AI's limitations more clearly.
Comments reflect skepticism about AI's current capabilities and the hype surrounding it.
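
For context, the task the models were given reduces to a set intersection, which is part of why it counts as "simple". The sketch below is a minimal illustration and not the author's method: the function names are hypothetical, and the sample data is a small hand-picked subset rather than a complete TLD or element list. It assumes the TLD list arrives as one uppercase entry per line with `#` comment lines, in the style of IANA's published list.

```python
def parse_tld_list(text: str) -> set[str]:
    """Parse a one-TLD-per-line listing, skipping blank and '#' comment lines."""
    return {
        line.strip().lower()
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    }


def matching_tlds(tlds: set[str], elements: list[str]) -> list[str]:
    """Return, in sorted order, the names that are both a TLD and an element."""
    return sorted(tlds & {name.lower() for name in elements})


# Illustrative sample only (assumption): a real run would use the full
# TLD list and the full set of HTML element names.
SAMPLE_TLDS = """# illustrative sample, not the full list
COM
ORG
BR
LI
TD
VIDEO
AUDIO
MENU
"""

SAMPLE_ELEMENTS = ["a", "br", "div", "li", "menu", "span", "td", "audio", "video"]

matches = matching_tlds(parse_tld_list(SAMPLE_TLDS), SAMPLE_ELEMENTS)
print(matches)  # ['audio', 'br', 'li', 'menu', 'td', 'video']
```

With complete inputs, the same two functions would produce the full answer deterministically, which is the baseline the LLM responses are implicitly being measured against.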