Report #52416
[counterintuitive] Why can't the model count characters in a word or reverse a string, despite perfect instructions?
Route any character-level string operation (counting, reversing, ROT13, finding the character at index N) to a code interpreter or external function. Never rely on the LLM's text generation for these tasks, regardless of how simple they seem.
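A minimal sketch of what such an external function layer could look like, assuming a Python tool backend. The names `STRING_TOOLS` and `run_string_tool` are hypothetical and not part of any particular framework; how the model's tool call reaches this dispatcher depends on the agent stack in use.

```python
import codecs


def count_char(text: str, char: str) -> int:
    """Count case-sensitive occurrences of a single character in text."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    return text.count(char)


def reverse_string(text: str) -> str:
    """Reverse by Unicode code point (not by grapheme cluster)."""
    return text[::-1]


def rot13(text: str) -> str:
    """Apply ROT13 to ASCII letters; other characters pass through."""
    return codecs.encode(text, "rot_13")


def char_at(text: str, index: int) -> str:
    """Return the character at a 0-based index; raises IndexError if out of range."""
    return text[index]


# Hypothetical registry: tool name (as the model would emit it) -> function.
STRING_TOOLS = {
    "count_char": count_char,
    "reverse_string": reverse_string,
    "rot13": rot13,
    "char_at": char_at,
}


def run_string_tool(name: str, **kwargs):
    """Dispatch a tool call by name so the answer comes from code, not generation."""
    if name not in STRING_TOOLS:
        raise KeyError(f"unknown string tool: {name}")
    return STRING_TOOLS[name](**kwargs)


if __name__ == "__main__":
    print(run_string_tool("count_char", text="strawberry", char="r"))
    print(run_string_tool("reverse_string", text="strawberry"))
```

The model's only job here is to pick the tool and pass the string through unchanged; the character-level work happens on the raw text, where the characters are actually available.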
Journey Context:
Developers assume character-level tasks are trivially easy and keep refining prompts when the model fails. The root cause is BPE tokenization: the model does not receive 'strawberry' as ['s','t','r','a','w','b','e','r','r','y']; it receives one or two opaque tokens such as ['straw','berry'] (the exact split depends on the tokenizer). The character-level information is not present in the model's input representation, so no prompt, however clever, can recover what was discarded before the model ever saw the text. To answer correctly, the model would need to have memorized the character composition of every token in its vocabulary, which is fragile and fails on edge cases (e.g., 'ChatGPT' tokenizes differently than 'chatgpt'). This is an architectural consequence of subword tokenization, not a capability gap that more parameters or better prompting closes.
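To make the point concrete, here is a toy illustration, not a real tokenizer. It uses greedy longest-match against a tiny made-up vocabulary rather than actual BPE merge rules, and the vocabulary entries and IDs in `TOY_VOCAB` are invented for the example. It only shows the shape of what the model receives: integer IDs, not letters.

```python
# Made-up vocabulary: subword piece -> token ID. Real vocabularies hold
# tens of thousands of entries and are learned from data.
TOY_VOCAB = {
    "straw": 1001,
    "berry": 1002,
    "chat": 1003,
    "Chat": 1004,
    "GPT": 1005,
    "gpt": 1006,
}


def toy_tokenize(text: str) -> list[int]:
    """Greedy longest-match tokenization (a simplification of BPE).

    Pieces not in TOY_VOCAB fall back to one ID per character,
    using the character's code point as a stand-in ID.
    """
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in TOY_VOCAB:
                ids.append(TOY_VOCAB[piece])
                i = j
                break
        else:
            # No vocabulary piece starts here: emit a single-character ID.
            ids.append(ord(text[i]))
            i += 1
    return ids


if __name__ == "__main__":
    print(toy_tokenize("strawberry"))  # two IDs; no letter 'r' anywhere in the input
    print(toy_tokenize("ChatGPT"))
    print(toy_tokenize("chatgpt"))     # different IDs from the capitalized form
```

Under these assumptions, 'strawberry' arrives as two integers, and 'ChatGPT' and 'chatgpt' share no IDs at all, which is why spelling knowledge has to be memorized per token rather than read off the input.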
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:28:26.392090+00:00 - report_created - created