Can modern LLMs count the number of b's in "blueberry"?

12 days ago

OpenAI released GPT-5, which fell short of expectations, notably on simple questions such as counting the letters in a word.
GPT-5 stated that 'blueberry' contains three 'b's when it contains only two, and multiple users reproduced the mistake.
The issue may stem from tokenization: LLMs receive text as numeric IDs for multi-character tokens rather than as individual letters, which makes letter counting difficult.
Despite the same tokenization challenge, some LLMs, such as the Claude models, counted the letters correctly, so performance varies across models.
Tests across several LLMs found that GPT-5 consistently miscounted the 'b's in 'blueberry', while other models, such as Claude and Gemini, gave mixed results.
The post concludes that while LLMs can count letters, their accuracy is inconsistent, raising questions about their reliability for such basic tasks.
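
To make the tokenization point concrete, here is a minimal Python sketch. It first counts the letters directly, then runs a toy greedy longest-match tokenizer to show how a word can reach a model as a handful of integer IDs with no letters visible. The vocabulary, the token IDs, and the helper names (`count_letter`, `toy_tokenize`, `TOY_VOCAB`) are all hypothetical and chosen for illustration; they are not the tokenizer of GPT-5 or of any real model.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of one letter in a word."""
    return word.lower().count(letter.lower())


# Hypothetical toy vocabulary (an assumption for illustration only):
# real tokenizers use learned vocabularies with many thousands of entries.
TOY_VOCAB = {
    "blue": 101,
    "berry": 102,
    "b": 1, "l": 2, "u": 3, "e": 4, "r": 5, "y": 6,
}


def toy_tokenize(text: str, vocab: dict) -> list:
    """Greedy longest-match tokenization over the toy vocabulary."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            # No vocabulary entry starts at position i.
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


if __name__ == "__main__":
    # Character-level view: the count is trivial.
    print(count_letter("blueberry", "b"))        # 2
    # Token-level view: two opaque IDs, no individual letters.
    print(toy_tokenize("blueberry", TOY_VOCAB))  # [101, 102]
```

In this toy setup the word becomes two IDs, so a system that sees only those IDs would have to have learned which letters each token contains in order to answer the counting question.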

Hasty Briefs (beta)