Report #43195
[counterintuitive] Model keeps miscounting letters in a word despite examples and chain-of-thought
Never ask an LLM to count characters directly; delegate to code execution (e.g., Python len() or .count()) or an external tool for any character-level operation, including substring counting, character position lookup, or spelling verification.
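A minimal sketch of what delegating to code could look like, assuming the model has access to a Python execution tool. The helper names (`count_char`, `char_positions`, `count_substring`, `same_spelling`) are hypothetical and exist only for illustration; they wrap the built-in `len()`, `str.count()`, and `enumerate()`.

```python
def count_char(word: str, char: str) -> int:
    """Count case-insensitive occurrences of a single character in word."""
    return word.lower().count(char.lower())


def char_positions(word: str, char: str) -> list[int]:
    """Return the 1-based positions at which char occurs in word."""
    target = char.lower()
    return [i for i, c in enumerate(word.lower(), start=1) if c == target]


def count_substring(text: str, sub: str) -> int:
    """Count non-overlapping occurrences of sub in text (str.count semantics)."""
    return text.count(sub)


def same_spelling(candidate: str, reference: str) -> bool:
    """Spelling check by exact, case-insensitive comparison against a reference."""
    return candidate.lower() == reference.lower()


if __name__ == "__main__":
    word = "strawberry"
    print(len(word))                   # total number of characters
    print(count_char(word, "r"))       # occurrences of 'r'
    print(char_positions(word, "r"))   # where each 'r' sits
    print(count_substring("banana", "an"))
    print(same_spelling("strawbery", word))
```

The model's job then reduces to choosing the right call and reporting its result, which does not depend on how the word was tokenized.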
Journey Context:
LLMs tokenize input into subword units (typically via BPE), not characters. The word 'strawberry' may tokenize as ['straw', 'berry'], in which case the model never receives three separate 'r' characters: it receives two token IDs, each mapped to an embedding vector that does not expose its spelling. Character-level detail is therefore unavailable as explicit input from the start, and whatever the model knows about a token's spelling is at best implicit in its learned weights. Prompting, few-shot examples, and chain-of-thought all operate on the same tokenized input, so they cannot reliably restore detail that was never presented to the first attention layer. This is why a model can discuss quantum physics but fail at 'how many r's in strawberry.' A reliable fix is architectural (character-level tokenization, which creates other problems) or external (tool use), not prompt-based.
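The effect can be illustrated with a toy tokenizer. This is a sketch under loud assumptions: `TOY_VOCAB` and its IDs are invented, and `toy_tokenize` uses greedy longest-match splitting as a stand-in for real BPE, whose merges and vocabularies differ by model.

```python
# Hypothetical toy vocabulary; real tokenizers have different entries and IDs.
TOY_VOCAB = {
    "straw": 1001, "berry": 1002,
    "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8,
}


def toy_tokenize(word: str) -> list[str]:
    """Split word by greedy longest match against TOY_VOCAB (not real BPE)."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining slice first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in TOY_VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i]!r}")
    return tokens


def to_ids(tokens: list[str]) -> list[int]:
    """Map token strings to the integer IDs a model would actually receive."""
    return [TOY_VOCAB[t] for t in tokens]


if __name__ == "__main__":
    tokens = toy_tokenize("strawberry")
    ids = to_ids(tokens)
    print(tokens)  # the subword pieces
    print(ids)     # what reaches the model: IDs, not letters
    # The ID for the standalone letter 'r' never appears in the sequence,
    # even though the original string contains that letter three times.
    print(TOY_VOCAB["r"] in ids, "strawberry".count("r"))
```

Under these assumptions, the ID sequence contains no separate symbol for 'r', while the original string, which only code operating on characters can see, contains it three times.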
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:58:41.932411+00:00— report_created — created