Report #62819
[counterintuitive] LLM fails to count characters or reverse strings — needs better prompting
Delegate all character-level operations (counting, reversal, substring extraction) to code execution or external tools. Prompting alone does not reliably overcome the character blindness that BPE tokenization introduces.
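A minimal sketch of what "delegate to code" can look like, assuming the model is allowed to call small deterministic helpers. The function names (count_char, reverse_text, substring) are hypothetical and chosen for illustration only. They are not part of any particular tool-calling API.

```python
def count_char(text: str, char: str) -> int:
    """Count occurrences of a single character, ignoring case."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    return text.lower().count(char.lower())


def reverse_text(text: str) -> str:
    """Reverse a string character by character."""
    return text[::-1]


def substring(text: str, start: int, end: int) -> str:
    """Return text[start:end] using Python's zero-based, end-exclusive slicing."""
    return text[start:end]


if __name__ == "__main__":
    print(count_char("Strawberry", "r"))   # 3
    print(reverse_text("Strawberry"))      # yrrebwartS
    print(substring("Strawberry", 0, 5))   # Straw
```

The model only has to decide which helper to call and with which arguments. The character-level work itself runs on the raw string, where every letter is visible.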
Journey Context:
BPE tokenization means the model's input is a sequence of subword token IDs, not individual characters. 'Strawberry' typically arrives as one or a few multi-character tokens (the exact split depends on the tokenizer), not as s-t-r-a-w-b-e-r-r-y. The model does not see the letters directly; it can only draw on whatever spelling information it has learned to associate with each token. This is less a reasoning failure than an input representation failure. Asking the model to 'think step by step' about character counting asks it to reason about information that is not in its input. The robust fixes are architectural (character-level tokenization) or tool-based (code execution). This is why LLMs of all sizes have tended to fail the 'how many r's in strawberry' test without tool use.
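The representation problem can be illustrated with a toy example. Everything here is an assumption made for illustration: the three-entry vocabulary is invented, and the encoder uses greedy longest-match rather than real BPE merge rules, so actual tokenizers will split the word differently.

```python
# Hypothetical toy vocabulary; real tokenizers have tens of thousands of entries.
TOY_VOCAB = {"str": 0, "aw": 1, "berry": 2}


def toy_encode(text: str, vocab: dict[str, int]) -> list[int]:
    """Greedy longest-match encoding (a stand-in for BPE, not BPE itself)."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i:]!r}")
    return ids


if __name__ == "__main__":
    ids = toy_encode("strawberry", TOY_VOCAB)
    print(ids)  # [0, 1, 2]
    # The model receives only the IDs. Nothing in [0, 1, 2] says how many
    # times the letter "r" occurs; that lives in the vocabulary strings.
    id_to_piece = {v: k for k, v in TOY_VOCAB.items()}
    print(sum(id_to_piece[t].count("r") for t in ids))  # 3
```

Counting letters from the IDs requires mapping each ID back to its spelling, which is a lookup the tokenizer can do but the model's input does not carry.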
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:55:26.288806+00:00— report_created — created