Report #83445
[counterintuitive] Why can't the model count characters in a word no matter how I prompt it?
Never rely on the model for character-level operations. Delegate counting, reversing, ROT13, substring-by-index, or any character-precise task to code execution or a post-processing script.
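As a minimal sketch of what "delegate to code" can look like, the helpers below cover the operations named above using only the Python standard library. The function names (count_char, reverse_text, rot13, substring_by_index) are hypothetical and chosen for illustration; they are not part of any particular tool-calling API. The sketch assumes the model's role is limited to passing the string through unchanged and that the code's result is what gets shown to the user.

```python
import codecs


def count_char(text: str, char: str, case_sensitive: bool = False) -> int:
    """Count occurrences of a single character in text."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    if not case_sensitive:
        text, char = text.lower(), char.lower()
    return text.count(char)


def reverse_text(text: str) -> str:
    """Reverse by Unicode code point (not by grapheme cluster)."""
    return text[::-1]


def rot13(text: str) -> str:
    """Apply ROT13 to ASCII letters; other characters pass through."""
    return codecs.encode(text, "rot_13")


def substring_by_index(text: str, start: int, end: int) -> str:
    """Return text[start:end] using 0-based, end-exclusive indices."""
    return text[start:end]


if __name__ == "__main__":
    word = "Strawberry"
    print(count_char(word, "r"))
    print(reverse_text(word))
    print(rot13(word))
    print(substring_by_index(word, 3, 7))
```

Note that Python strings are indexed by code point, so text containing combining marks or multi-code-point emoji may need a grapheme-aware library if "character" means what a reader sees on screen.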
Journey Context:
The common belief is that character-counting failures are a reasoning deficit that better prompting can fix: 'count carefully', 'go letter by letter', 'think step by step'. This is wrong. The root cause is subword tokenization (for example BPE): the model receives token IDs, not individual characters. 'Strawberry' may be tokenized as ['str', 'aw', 'berry'], so the character 'r' is not a discrete unit the model can inspect. Chain-of-thought sometimes appears to work for short, common words because the model has memorized their spellings from training data, but this breaks unpredictably for uncommon words, longer strings, and non-English text. Prompt engineering and model scaling do not reliably fix this, because the character-level structure is hidden at the tokenization layer before the model processes the input; the model can only fall back on whatever spellings it has learned to associate with each token. The same limitation applies to string reversal, character-level ciphers, and index-based substring extraction.
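The tokenization point can be illustrated with a toy example. The sketch below is an assumption-laden stand-in, not real BPE: it uses a hypothetical three-entry vocabulary (TOY_VOCAB) and a greedy longest-match split, whereas real tokenizers learn merge rules from data and have far larger vocabularies. It only shows that what reaches the model is a list of integers, and that recovering letters requires mapping those integers back to strings, which code can do directly.

```python
# Hypothetical vocabulary for illustration only; real vocabularies are learned
# from data and the actual split of "strawberry" depends on the tokenizer.
TOY_VOCAB = {"str": 0, "aw": 1, "berry": 2}


def toy_tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    """Greedy longest-match split into token IDs (a stand-in, not real BPE)."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, then shrink.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary entry covers {text[i:]!r}")
    return ids


# What the model is given: integers, with no per-letter structure exposed.
token_ids = toy_tokenize("strawberry", TOY_VOCAB)

# What code can do: map IDs back to strings and inspect characters directly.
id_to_piece = {idx: piece for piece, idx in TOY_VOCAB.items()}
spelled = "".join(id_to_piece[t] for t in token_ids)
r_count = spelled.count("r")

if __name__ == "__main__":
    print(token_ids, spelled, r_count)
```

In this toy setup the three occurrences of 'r' are spread across two tokens, so counting them from the IDs alone would require knowing the spelling of each token.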
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:38:46.184731+00:00— report_created — created