Report #72081
[counterintuitive] Why can't the model count characters in a word reliably, no matter how I prompt it?
Delegate all character-level operations (counting, indexing, reversing) to code execution or an external tool. Never rely on the model's direct character manipulation, regardless of prompt sophistication.
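A minimal sketch of what delegating these operations to code could look like, assuming the model is allowed to call small helper functions through a code-execution or tool-calling layer. The function names (`count_char`, `char_at`, `reverse_text`) are hypothetical and chosen only for illustration. They operate on Unicode code points, so text with combining marks or emoji sequences may need a grapheme-aware library instead.

```python
def count_char(text: str, char: str, case_sensitive: bool = True) -> int:
    """Count how many times a single character occurs in text."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    if not case_sensitive:
        text, char = text.lower(), char.lower()
    return text.count(char)


def char_at(text: str, index: int) -> str:
    """Return the character at a zero-based index (raises IndexError if out of range)."""
    return text[index]


def reverse_text(text: str) -> str:
    """Return text with its code points in reverse order."""
    return text[::-1]


if __name__ == "__main__":
    word = "strawberry"
    print(count_char(word, "r"))  # 3
    print(char_at(word, 5))       # b
    print(reverse_text(word))     # yrrebwarts
```

The model's job then reduces to choosing the right helper and passing the string through unchanged; the answer comes from the interpreter, not from the model's token-level view of the word.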
Journey Context:
Developers tend to assume that character counting is a reasoning task that better prompting can solve. It is not. LLMs ingest text through subword tokenization such as BPE, which maps each word to a short sequence of token IDs; 'strawberry' might become something like ['straw', 'berry'], though the exact split depends on the tokenizer. The model never sees the individual 'r' characters directly: the spelling of each token is not explicit in its input representation and is at best learned indirectly during training. Chain-of-thought, few-shot examples, and instruction tuning cannot reliably recover information that is hidden at the tokenizer boundary. This is why models famously fail 'how many r's in strawberry': the failure is rooted in input encoding rather than in reasoning. The same root cause breaks string reversal, substring extraction by index, and any other operation that requires character-level fidelity.
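To make the tokenizer boundary concrete, here is a toy sketch of what the model's input looks like after encoding. It is not real BPE: `TOY_VOCAB` and `toy_encode` are hypothetical, the vocabulary is invented for this example, and greedy longest-match is used as a simplified stand-in for a trained merge table. The point is only that the model receives integer IDs, not letters.

```python
# Hypothetical toy vocabulary. Real tokenizers have tens of thousands of
# entries, and the actual split of "strawberry" depends on the tokenizer.
TOY_VOCAB = {
    "straw": 101, "berry": 102,
    "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8,
}


def toy_encode(text: str, vocab: dict = TOY_VOCAB) -> list:
    """Greedy longest-match encoding: a simplified stand-in for BPE."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest remaining piece first, then shrink.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in vocab:
                ids.append(vocab[piece])
                pos = end
                break
        else:
            raise ValueError(f"no vocabulary entry covers {text[pos]!r}")
    return ids


if __name__ == "__main__":
    print(toy_encode("strawberry"))  # [101, 102]
    print(toy_encode("star"))        # [1, 2, 4, 3]
```

In this toy setup the word 'strawberry' reaches the model as two IDs, and nothing in `[101, 102]` says that three of the underlying characters are 'r'. A word that falls back to single-character tokens, like 'star' here, happens to expose its letters, which is one reason character-level accuracy can vary unpredictably from word to word.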
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:33:58.357594+00:00— report_created — created