Report #99064
[counterintuitive] Model miscounts letters in a word or fails exact token-length tasks despite clear instructions
Do not expect exact counting from an LLM. For length constraints, pre-tokenize the text with the target model's tokenizer and pass the token or character count to the model explicitly. For exact string manipulation, use regex or code tools instead of the model.
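A minimal sketch of the recommendation above, using only the Python standard library. The helper names (count_letter, count_words, build_prompt) are hypothetical and chosen for illustration. The idea is to compute the count in code and hand the model the number, rather than asking it to count.

```python
import re


def count_letter(word: str, letter: str, case_sensitive: bool = False) -> int:
    """Return the exact number of times a single character occurs in word."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    if not case_sensitive:
        word, letter = word.lower(), letter.lower()
    return word.count(letter)


def count_words(text: str) -> int:
    """Count word-like runs with a regex instead of asking the model."""
    return len(re.findall(r"\w+", text))


def build_prompt(word: str, letter: str) -> str:
    """Hypothetical prompt builder: the count is computed in code, then stated."""
    n = count_letter(word, letter)
    return (
        f"Fact (computed in code): '{word}' contains '{letter}' {n} time(s). "
        "Use this number as given; do not recount."
    )


if __name__ == "__main__":
    print(count_letter("strawberry", "r"))
    print(build_prompt("strawberry", "r"))
```

For token-length tasks the same pattern applies, with the count taken from the target model's own tokenizer rather than from len() or a regex. That part is omitted here because the tokenizer API depends on the model in use.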
Journey Context:
Developers treat counting as a primitive skill and assume that 'count carefully' or step-by-step prompting fixes it. The root cause is BPE/WordPiece tokenization: the model does not see characters as atomic units, only merged subword fragments. Counting letters therefore means knowing how many characters sit inside each token and summing across token boundaries, which is exact bookkeeping the architecture does not naturally support. The correct response is not prompt engineering but deterministic code or tokenizer APIs.
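To illustrate the tokenization point above, here is a toy sketch. It is not real BPE or WordPiece: it uses a made-up vocabulary (TOY_VOCAB) and a simple greedy longest-match split, purely to show that a letter count has to be assembled from fragments. Real tokenizers use learned vocabularies and will split words differently.

```python
# Hypothetical vocabulary for illustration only; real vocabularies are learned.
TOY_VOCAB = {"straw", "berry", "st", "raw", "ber", "ry"}


def toy_tokenize(word: str, vocab=TOY_VOCAB) -> list[str]:
    """Greedy longest-match split; falls back to single characters."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate piece starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No vocabulary entry matched: emit one character as its own token.
            tokens.append(word[i])
            i += 1
    return tokens


def per_token_counts(word: str, letter: str) -> list[tuple[str, int]]:
    """Show how many times letter appears inside each fragment."""
    return [(tok, tok.count(letter)) for tok in toy_tokenize(word)]


if __name__ == "__main__":
    print(toy_tokenize("strawberry"))
    print(per_token_counts("strawberry", "r"))
```

With this toy vocabulary, "strawberry" becomes two fragments, and the three occurrences of "r" are split one and two across them. The model receives fragment IDs rather than letters, so the per-fragment counts that this code reads off directly are not explicit in its input.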
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-28T05:15:01.499732+00:00— report_created — created