Report #61646
[counterintuitive] Why can't the model count characters or reverse strings correctly, even with careful prompting?
Delegate all character-level string operations (counting, reversing, substring checks) to code execution. Never rely on the model's direct text output for character-precise tasks.
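As a minimal sketch of what delegating to code execution can look like, the three operations named above map onto a few lines of Python. The function names (count_char, reverse_text, contains) are hypothetical helpers chosen for illustration, not part of any particular tool or API.

```python
def count_char(text: str, char: str) -> int:
    """Count how many times a single character occurs in text."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    return text.count(char)


def reverse_text(text: str) -> str:
    """Reverse text by Unicode code point (not by grapheme cluster)."""
    return text[::-1]


def contains(text: str, fragment: str) -> bool:
    """Exact, case-sensitive substring check."""
    return fragment in text


if __name__ == "__main__":
    word = "strawberry"
    print(count_char(word, "r"))   # 3
    print(reverse_text(word))      # yrrebwarts
    print(contains(word, "awb"))   # True
```

One caveat with this sketch: slicing with [::-1] reverses code points, so text containing combining marks or multi-code-point emoji may need grapheme-aware handling instead.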
Journey Context:
LLMs operate on subword tokens (typically produced by byte-pair encoding, BPE), not on individual characters. The word 'strawberry' may tokenize as ['str', 'aw', 'berry'], so the model receives three token IDs and never sees the three 'r's as separate entities. Prompting it to 'count carefully' or 'go letter by letter' produces a simulacrum of counting that operates on token-level approximations and fails unpredictably. This is not a laziness or attention issue: character-level information is not directly present in the model's input representation, and prompt engineering cannot reliably recover what tokenization hides. This is also why the model can write a Python function that counts characters correctly yet cannot reliably count them itself: the code execution path operates on actual characters, while the text generation path operates on tokens.
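The gap between the two paths can be illustrated with a small sketch. It reuses the example split from the paragraph above, which is only one possible tokenization, and the token IDs are invented for illustration; a real tokenizer would produce different values.

```python
# Example split from the report; real tokenizers may split the word differently.
tokens = ["str", "aw", "berry"]

# Made-up vocabulary IDs, purely illustrative.
vocab = {"str": 101, "aw": 102, "berry": 103}

# Roughly what the text generation path receives: opaque integers.
token_ids = [vocab[t] for t in tokens]

# What the code execution path works with: the actual characters.
text = "".join(tokens)
r_per_token = [t.count("r") for t in tokens]

print(token_ids)         # [101, 102, 103]
print(text)              # strawberry
print(r_per_token)       # [1, 0, 2]
print(sum(r_per_token))  # 3
```

Nothing in the ID sequence indicates that one token contains a single 'r' and another contains two; that detail only becomes explicit once the string is handled as characters.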
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:57:52.697505+00:00— report_created — created