Report #42086
[counterintuitive] LLM fails to count characters in a word despite elaborate prompting
Route all character-level, substring, or byte-level string operations through code execution (tool use). Never rely on the model's direct text output for counting, indexing, or substring tasks, regardless of how you prompt.
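A minimal sketch of what "routing through code execution" could look like, assuming a Python tool layer. The function names, the `TOOLS` registry, and `run_string_tool` are hypothetical and not tied to any particular tool-use API; the point is only that the answer comes from deterministic string code and not from generated text.

```python
from typing import Callable, Dict, List


def count_char(text: str, char: str, case_sensitive: bool = True) -> int:
    """Count occurrences of a single character in text."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    if not case_sensitive:
        text, char = text.casefold(), char.casefold()
    return text.count(char)


def char_positions(text: str, char: str) -> List[int]:
    """Return the 0-based indices at which char occurs in text."""
    return [i for i, c in enumerate(text) if c == char]


def byte_length(text: str, encoding: str = "utf-8") -> int:
    """Return the encoded length of text in bytes."""
    return len(text.encode(encoding))


# Hypothetical registry: the model would emit a tool name plus arguments,
# and the host application would execute the matching function.
TOOLS: Dict[str, Callable] = {
    "count_char": count_char,
    "char_positions": char_positions,
    "byte_length": byte_length,
}


def run_string_tool(name: str, **kwargs):
    """Dispatch a tool call by name; unknown names are rejected."""
    if name not in TOOLS:
        raise KeyError(f"unknown string tool: {name}")
    return TOOLS[name](**kwargs)


if __name__ == "__main__":
    print(run_string_tool("count_char", text="strawberry", char="r"))
    print(run_string_tool("char_positions", text="strawberry", char="r"))
    print(run_string_tool("byte_length", text="na\u00efve"))
```

In this arrangement the model only decides which tool to call and with which arguments; the count, the indices, and the byte length are computed on the original string.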
Journey Context:
Developers assume character counting is trivial and try increasingly clever prompts: spell-it-out chains, step-by-step decomposition, verification loops. None of these work reliably. The root cause is tokenization: LLMs ingest subword tokens (typically BPE), not characters. Depending on the tokenizer, the word 'strawberry' may arrive as a single token ID or as a few subword IDs; either way, the model's input representation does not contain the character sequence 's-t-r-a-w-b-e-r-r-y'. The model can memorize spellings and character counts for common short words, but it cannot reliably decompose arbitrary tokens into characters, because the character-level information is not present in the token IDs it receives. This is not a reasoning deficit that more parameters or better prompts reliably fix; it is a mismatch between the model's input representation and what the task requires.
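The gap described above can be illustrated with a toy tokenizer. This is a sketch under stated assumptions: `TOY_VOCAB` and its IDs are invented for illustration, and `toy_encode` uses greedy longest-match rather than real BPE merge rules, so it does not reproduce any actual model's tokenization. It only shows that what reaches the model is a list of integers in which the letters are no longer visible.

```python
from typing import Dict, List

# Hypothetical subword vocabulary; the pieces and IDs are made up.
TOY_VOCAB: Dict[str, int] = {
    "straw": 5001,
    "berry": 5002,
    "count": 5003,
    "ing": 5004,
}


def toy_encode(text: str) -> List[int]:
    """Greedy longest-match encoding (a simplification, not real BPE).

    Characters not covered by a vocabulary piece fall back to their
    Unicode code point so that every input can be encoded.
    """
    pieces = sorted(TOY_VOCAB, key=len, reverse=True)
    ids: List[int] = []
    i = 0
    while i < len(text):
        for piece in pieces:
            if text.startswith(piece, i):
                ids.append(TOY_VOCAB[piece])
                i += len(piece)
                break
        else:
            # No vocabulary piece matched at position i.
            ids.append(ord(text[i]))
            i += 1
    return ids


if __name__ == "__main__":
    word = "strawberry"
    # The model-side view: opaque integers with no letters in them.
    print(toy_encode(word))
    # The code-side view: the original characters are still available.
    print(word.count("r"))
```

Counting the letter 'r' from the integer list alone would require the model to have memorized the spelling behind each ID, whereas code operating on the original string reads the characters directly.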
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T01:06:42.928541+00:00 — report_created — created