Report #38591
[counterintuitive] Why can't the model count characters, reverse strings, or find the nth character in a word?
Delegate all character-level string operations to code execution. Use Python len(), string slicing, reversed(), or index operations. Never prompt the model to perform character-level operations directly, no matter how trivial they seem.
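A minimal sketch of what the delegated code might look like, using only Python built-ins. The helper names count_char, reverse_string, and nth_char are illustrative and not part of any library; the 1-indexed convention in nth_char is an assumption chosen to match how people usually phrase "the nth character".

```python
def count_char(text: str, char: str) -> int:
    """Count occurrences of a single character in text."""
    return text.count(char)


def reverse_string(text: str) -> str:
    """Reverse a string with slicing."""
    return text[::-1]


def nth_char(text: str, n: int) -> str:
    """Return the nth character, counting from 1 (assumed convention)."""
    if not 1 <= n <= len(text):
        raise IndexError(f"n must be between 1 and {len(text)}")
    return text[n - 1]


word = "strawberry"
print(len(word))             # 10
print(count_char(word, "r")) # 3
print(reverse_string(word))  # yrrebwarts
print(nth_char(word, 3))     # r
```

Note that len() and slicing operate on Unicode code points, so text containing combining marks or multi-code-point emoji may need grapheme-aware handling.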
Journey Context:
The widespread assumption is that if a model can write complex code, it can surely count the letters in 'strawberry'. This is wrong. LLMs use BPE (Byte Pair Encoding) tokenization: 'strawberry' becomes tokens like ['str', 'aw', 'berry'], not individual characters. The model's input representation has no character-level granularity, so it cannot directly perceive the letters inside a token. Chain-of-thought, few-shot examples, and prompt engineering do not add character-level perception that is absent from the input. This is an architectural fact of the tokenizer, not a reasoning deficit. Scaling the model does not remove the limitation, because larger models still read tokens rather than characters. The dependable fix is external computation.
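The tokenization point above can be illustrated with a toy example. This is a sketch under stated assumptions: TOY_VOCAB is a hypothetical three-entry vocabulary, and toy_tokenize uses greedy longest-match segmentation as a stand-in for BPE rather than real BPE merge rules. Real tokenizers have far larger vocabularies and may split 'strawberry' differently.

```python
# Hypothetical vocabulary for illustration only; real vocabularies are
# much larger and the actual split varies by tokenizer.
TOY_VOCAB = {"str": 0, "aw": 1, "berry": 2}


def toy_tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    """Greedy longest-match segmentation (a stand-in, not real BPE)."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest remaining piece first, then shrink.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in vocab:
                ids.append(vocab[piece])
                pos = end
                break
        else:
            raise ValueError(f"no token covers {text[pos:]!r}")
    return ids


token_ids = toy_tokenize("strawberry", TOY_VOCAB)
print(token_ids)  # [0, 1, 2]
```

The model receives something like the ID sequence, not the letters. Nothing in [0, 1, 2] states that the word contains three occurrences of 'r'; that information is only recoverable from the original string, which is why the count belongs in code.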
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:15:10.485037+00:00— report_created — created