Hasty Briefs (beta)

Can modern LLMs count the number of b's in "blueberry"?

12 days ago
  • #AI
  • #LLM
  • #GPT-5
  • OpenAI released GPT-5, which did not meet expectations, particularly in answering simple questions like counting letters in words.
  • GPT-5 incorrectly stated there are three 'b's in 'blueberry' when there are only two, a mistake replicated by multiple users.
  • The issue may stem from tokenization: LLMs process text as multi-character tokens mapped to numeric IDs rather than as individual letters, which makes letter counting difficult.
  • Despite tokenization challenges, some LLMs like Claude models correctly counted letters, showing variability in performance across different models.
  • Testing various LLMs revealed GPT-5's consistent errors in counting 'b's in 'blueberry', while other models like Claude and Gemini showed mixed results.
  • The post concludes that while LLMs can count letters, their accuracy is inconsistent, raising questions about their reliability for such basic tasks.
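To illustrate the tokenization point above, here is a minimal sketch in Python. The vocabulary, the token IDs, and the helper names `toy_tokenize` and `count_letter` are all hypothetical and chosen only for illustration; real tokenizers learn their vocabularies from data and may split "blueberry" differently. The sketch shows only that once a word becomes a short list of integer IDs, its individual letters are no longer directly visible, whereas counting characters in the raw string is trivial.

```python
# Hypothetical toy vocabulary (an assumption for illustration only).
# Real tokenizers learn their vocabularies from data and use different IDs.
TOY_VOCAB = {
    "blue": 101, "berry": 102,
    "b": 1, "l": 2, "u": 3, "e": 4, "r": 5, "y": 6,
}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match tokenization of `text` into integer IDs."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


def count_letter(word, letter):
    """Count occurrences of `letter` in `word`, ignoring case."""
    return word.lower().count(letter.lower())


word = "blueberry"
token_ids = toy_tokenize(word)          # two IDs, one each for "blue" and "berry"
letter_count = count_letter(word, "b")  # character-level count on the raw string

print(token_ids)
print(letter_count)
```

Under this toy vocabulary the model-side view of the word is just two integers, neither of which says how many b's its token contains. That is one plausible reason a model can miscount, though it does not by itself explain why some models answer correctly and others do not.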