Report #63034
[counterintuitive] Why can't the LLM count characters in a word or spell it backwards, despite being told to think carefully?
Delegate all character-level operations (counting, reversing, substring extraction) to code execution or external tools. Never rely on the model's direct text generation for these tasks regardless of prompt sophistication or model size.
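A minimal sketch of what the delegated code might look like, assuming a Python execution tool is available to the model. The helper names (`count_char`, `reverse_text`, `substring`) are hypothetical and chosen for illustration only; they are not part of any library.

```python
def count_char(word: str, char: str) -> int:
    """Count case-insensitive occurrences of a single character in word."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    return word.lower().count(char.lower())


def reverse_text(word: str) -> str:
    """Reverse by Unicode code point (not by grapheme cluster)."""
    return word[::-1]


def substring(word: str, start: int, end: int) -> str:
    """Return the characters in the half-open range [start, end)."""
    return word[start:end]


print(count_char("Strawberry", "r"))   # 3
print(reverse_text("Strawberry"))      # yrrebwartS
print(substring("Strawberry", 0, 5))   # Straw
```

The code works on the actual string, so the result does not depend on how the word was tokenized. Text with combining marks or emoji sequences would need grapheme-aware handling, which this sketch leaves out.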
Journey Context:
LLMs tokenize input into subword units (for example, BPE tokens) before processing. The model never sees individual characters; it sees integer token IDs. 'Strawberry' might tokenize as roughly 3 tokens, so the model cannot count the 'r's by inspecting the word. At best it recalls spelling associations it learned for those tokens during training, and that recall is unreliable. This is a perceptual limitation, not a reasoning deficit. Chain-of-thought prompting, few-shot examples, and model scaling do not reliably fix it, because the character-level view is discarded at the input layer before the model begins processing. Developers burn hours crafting increasingly elaborate prompts for what is fundamentally an input representation problem. The model can, however, write Python code that performs these operations correctly, so delegate to code.
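To make the representation problem concrete, here is a toy illustration. The vocabulary, the token IDs, and the three-way split of "strawberry" are all invented for this sketch. It uses greedy longest-match splitting as a stand-in, whereas a real BPE tokenizer applies learned merge rules and may split the word differently.

```python
# Hypothetical vocabulary: the pieces and IDs are invented for illustration.
TOY_VOCAB = {"str": 101, "aw": 102, "berry": 103}


def toy_tokenize(word: str, vocab: dict[str, int]) -> tuple[list[str], list[int]]:
    """Greedy longest-match split of word into vocab pieces.

    This is a simplified stand-in for subword tokenization, not real BPE.
    """
    text = word.lower()
    pieces, ids = [], []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until one is in the vocab.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                pieces.append(piece)
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary piece matches at position {i}")
    return pieces, ids


pieces, ids = toy_tokenize("Strawberry", TOY_VOCAB)
print(pieces)  # ['str', 'aw', 'berry']
print(ids)     # [101, 102, 103]  <- all the model receives
```

Nothing in the ID sequence `[101, 102, 103]` states that the first piece holds one 'r' and the last piece holds two. That information exists only in the original string, which is why the counting belongs in code that runs on the string.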
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T12:17:11.065403+00:00— report_created — created