Report #75925
[counterintuitive] Why can't the LLM count characters in a string, even with careful step-by-step prompting?
Never rely on the LLM for character-level operations. Delegate all character counting, substring position finding, and character-level manipulation to a code execution tool (Python len(), str.count(), str.index(), etc.).
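A minimal sketch of the kind of helper a code execution tool could run for these operations. The function name `char_stats` and the shape of its return value are hypothetical, chosen for illustration; only `len()`, `str.count()`, and `str.find()` are standard Python.

```python
def char_stats(text: str, target: str) -> dict:
    """Return the length of text, the number of occurrences of target,
    and the 0-based position of every occurrence.

    Hypothetical helper for illustration; not part of any library.
    """
    if not target:
        raise ValueError("target must be a non-empty string")

    positions = []
    start = 0
    while True:
        idx = text.find(target, start)  # -1 when no further match exists
        if idx == -1:
            break
        positions.append(idx)
        start = idx + 1  # continue searching just past this match

    return {
        "length": len(text),
        "count": text.count(target),  # non-overlapping occurrences
        "positions": positions,
    }


result = char_stats("strawberry", "r")
print(result)  # {'length': 10, 'count': 3, 'positions': [2, 7, 8]}
```

The LLM's job is then reduced to choosing the arguments and reporting the result, which does not require it to see individual characters.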
Journey Context:
The widespread belief is that character counting is a reasoning task that better prompting can fix. In practice, BPE tokenization replaces the characters with subword token IDs before the model processes anything. The word 'strawberry' tokenizes as ['str', 'aw', 'berry'] in tiktoken (the exact split depends on the encoding and on surrounding whitespace), so the model never sees three individual 'r' characters. No prompt, however clever, can restore information that was discarded at the input layer. This is not a model intelligence issue; it is a limit on what the input representation carries. Developers waste hours crafting prompts to fix this, but the solution is purely architectural: use code for character-level tasks. The same limitation affects any task requiring character-level precision: finding the nth character, checking character by character whether a string is a palindrome, or counting specific characters.
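The other tasks named above (nth character, palindrome check) can be delegated the same way. This is a sketch under stated assumptions: the function names `nth_char` and `is_palindrome` are hypothetical, positions are treated as 1-based to match how people usually phrase "the nth character", and the palindrome check is strict (case, spaces, and punctuation all count).

```python
def nth_char(text: str, n: int) -> str:
    """Return the n-th character of text, counting from 1.

    Hypothetical helper. Python indexes by Unicode code point, so a
    character built from several code points (for example, some emoji)
    spans more than one position.
    """
    if not 1 <= n <= len(text):
        raise IndexError(f"n must be between 1 and {len(text)}, got {n}")
    return text[n - 1]


def is_palindrome(text: str) -> bool:
    """Strict character-by-character palindrome check.

    Compares the string with its reverse; no case folding or
    whitespace stripping is applied.
    """
    return text == text[::-1]


print(nth_char("strawberry", 3))    # r
print(is_palindrome("racecar"))     # True
print(is_palindrome("strawberry"))  # False
```

If a task needs case-insensitive or punctuation-insensitive matching, normalize the string in code before calling these helpers rather than asking the model to do it.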
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T10:01:51.342615+00:00— report_created — created