- ChatGPT passes the “strawberry” test but fails when switched to “cranberry”
- AI still struggles with simple letter counting despite wider improvements
- Sanity tests like “car wash” still reveal holes in AI logic
There are a number of viral posts from people who are amazed that chatbots like ChatGPT and Claude can solve complex equations, but struggle with something as simple as counting the number of “r”s in the word “strawberry.” Well, those days could finally be over.
Leading with the word “Finally,” the official ChatGPT X account (@ChatGPTapp) proudly announced today that it can now count the number of “r’s” in “strawberry” — a ridiculously easy task for humans that has traditionally been difficult for AIs to get right.
However, users found out very quickly that you could still get around it by replacing “strawberry” with “cranberry.”
“Not so fast,” X user @NathanEspinoza_ replied to ChatGPTapp’s bragging post about solving the strawberry problem, posting a screenshot showing ChatGPT answering that there was only one “r” in “cranberry.”
To confirm the result, I quickly tried the same prompt with my version of ChatGPT running GPT-5.5 and was told that there were two “r’s” – a different answer, but still wrong. It passed the “strawberry” test perfectly, saying there were three “r’s,” but then claimed there were only two in “cranberry.” To its credit, ChatGPT admitted its error when I questioned it, chalking it up to a simple “count error.”
Why the strawberry problem exists
There are a few very simple questions that chatbots are notoriously bad at answering, one of which is “how many ‘r’s are there in strawberry?”
This is a straightforward counting task for humans, but it is surprisingly difficult for AI systems. The reason comes down to how they process the language. Large language models (LLMs) are built on transformers, which convert words like “strawberry” into numerical representations. These representations capture meaning and context, but they do not inherently preserve a clear sense of the individual letters that make up the word.
The fact that ChatGPT is still tripping over “cranberry” suggests that the solution may have been hard-coded for specific cases, rather than reflecting a broader improvement in how LLMs handle these kinds of issues.
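The gap is easy to see in plain code. A program that operates on characters can count letters trivially, while a tokenizer hands the model integer IDs for subword chunks, hiding the letter-level structure. Here is a minimal, illustrative sketch – the `toy_tokenize` function and its tiny vocabulary are invented for this example and are not how any real model actually splits these words:

```python
# Counting letters is trivial when you can see the characters directly.
word = "strawberry"
print(word.count("r"))  # 3

def toy_tokenize(text, vocab):
    """Greedy longest-match tokenizer sketch (illustrative only).

    Real LLM tokenizers (e.g. byte-pair encoding) are more sophisticated,
    but the effect is the same: the model receives chunk IDs, not letters.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first; fall back to one char.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens

# Hypothetical subword vocabulary, chosen only to illustrate the split.
vocab = {"str", "aw", "berry", "cran"}
print(toy_tokenize("strawberry", vocab))  # ['str', 'aw', 'berry']
print(toy_tokenize("cranberry", vocab))   # ['cran', 'berry']
```

Once “strawberry” is three opaque chunks, “how many r’s?” is no longer a lookup the model can do directly – it has to have learned the answer some other way.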
The car wash problem
The other boast in ChatGPTapp’s post is that ChatGPT can now solve the car wash problem. This exploits a context gap in how LLMs reason by asking whether it would be faster to walk or drive to a car wash that is “only 50 meters away.” Most models will tell you it’s faster to walk, overlooking the obvious problem that you need to bring your car with you to wash it.
ChatGPTapp claims that ChatGPT will now catch this error and point it out. But when I tried it with the latest GPT-5.5 model, it still recommended walking – as did Claude using Sonnet 4.6. When I tested it in Gemini, however, it pointed out that while walking would be faster, you’d need to bring the car if the goal was to wash it.
Grok did even better. Not only did it flag the problem of leaving the car behind, but it added that “this question has become a popular test of whether someone (or an AI) understands the actual goal versus giving generic ‘walking is healthier/shorter/greener’ advice that ignores context.”
So for now, at least, it’s a win for Gemini and Grok. But if fixing “strawberry” doesn’t fix “cranberry,” it raises a bigger question: are these models actually getting smarter, or are they just getting better at passing the tests we keep throwing at them?
Follow TechRadar on Google News and add us as a preferred source to get our expert news, reviews and opinions in your feeds.
