Report #58975
[counterintuitive] Why can't the model count characters in a word or find specific letters, even with step-by-step instructions?
Delegate all character-level operations (counting, finding, replacing specific characters) to code execution. Use Python's len(), str.count(), or indexing via a tool call or code interpreter. Never rely on the model's direct text output for character-level tasks.
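A minimal sketch of the kind of helper a tool call or code interpreter could run for this. The function name char_report and its return shape are hypothetical, chosen for illustration; only built-in Python operations are used.

```python
def char_report(text: str, target: str) -> dict:
    """Return exact character-level facts about `text` for one character.

    Hypothetical helper: matching is case-insensitive, and positions are
    0-based indices into the original string.
    """
    if len(target) != 1:
        raise ValueError("target must be a single character")
    needle = target.lower()
    # Compare character by character so indices refer to the original text.
    positions = [i for i, ch in enumerate(text) if ch.lower() == needle]
    return {
        "length": len(text),
        "count": len(positions),
        "positions": positions,
    }


report = char_report("Strawberry", "r")
print(report)  # {'length': 10, 'count': 3, 'positions': [2, 7, 8]}
```

The model's job is then only to decide which string and which character to pass in, and to report the returned values verbatim.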
Journey Context:
Developers assume character counting is trivial and that better prompting ('count each letter carefully') will fix failures. The root cause is BPE tokenization: the model receives token IDs, not individual characters. For example, a tokenizer may split 'strawberry' into ['straw', 'berry'] (the exact split depends on the tokenizer), so the three 'r' characters are not directly visible to the model because they are embedded inside tokens. Chain-of-thought fails for the same reason: the model reasons over tokens, not characters. This is a limitation at the tokenization layer, below the reasoning layer. No prompt technique can reliably recover character information that was collapsed into token IDs before the model processes it.
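The effect can be illustrated with a toy encoder. This is a sketch only: TOY_VOCAB and toy_encode are hypothetical, the IDs are made up, and greedy longest-match is a stand-in for real BPE, whose vocabulary and merges are learned from data.

```python
from typing import List

# Hypothetical two-entry vocabulary, for illustration only.
TOY_VOCAB = {"straw": 101, "berry": 102}


def toy_encode(word: str) -> List[int]:
    """Greedy longest-match encoding against TOY_VOCAB (not real BPE)."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining piece first, then shrink.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in TOY_VOCAB:
                ids.append(TOY_VOCAB[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i:]!r}")
    return ids


ids = toy_encode("strawberry")
print(ids)                       # [101, 102]  <- what the model receives
print("strawberry".count("r"))   # 3           <- what code can see
```

Nothing in the sequence [101, 102] states how many 'r' characters each token contains, whereas code operating on the raw string answers the question exactly.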
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T05:28:35.153234+00:00— report_created — created