Can modern LLMs count the number of b's in "blueberry"?
12 days ago
- #AI
- #LLM
- #GPT-5
- OpenAI released GPT-5, which fell short of expectations, notably on simple questions such as counting the letters in a word.
- GPT-5 incorrectly stated there are three 'b's in 'blueberry' when there are only two, a mistake replicated by multiple users.
- The issue may stem from tokenization: LLMs process text as sub-word tokens mapped to numeric IDs rather than as individual letters, which makes letter counting difficult.
- Despite tokenization challenges, some LLMs, such as the Claude models, counted the letters correctly, so performance varies across models.
- Testing various LLMs revealed GPT-5's consistent errors in counting 'b's in 'blueberry', while other models like Claude and Gemini showed mixed results.
- The post concludes that while LLMs can count letters, their accuracy is inconsistent, raising questions about their reliability for such basic tasks.
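
To make the tokenization point concrete, here is a minimal Python sketch. It contrasts a character-level count, which is trivial for ordinary code, with a toy view of what a model might receive instead. The `TOY_VOCAB` table, its token IDs, and the `toy_tokenize` helper are hypothetical and chosen only for illustration; real tokenizers may split "blueberry" differently.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


# Hypothetical sub-word pieces and IDs, for illustration only.
# Real tokenizers use much larger vocabularies and may split differently.
TOY_VOCAB = {"blue": 101, "berry": 202}


def toy_tokenize(word: str) -> list[int]:
    """Greedy longest-match split against TOY_VOCAB, returning integer IDs."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining piece first, then shrink.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in TOY_VOCAB:
                ids.append(TOY_VOCAB[piece])
                i = j
                break
        else:
            raise ValueError(f"no token for {word[i:]!r}")
    return ids


print(count_letter("blueberry", "b"))  # 2
print(toy_tokenize("blueberry"))       # [101, 202]
```

Under these assumptions the model-side input is two integers, neither of which exposes the letters inside it, which is one plausible reason a letter count can go wrong even though the character-level answer is plainly two.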