Report #53673
[counterintuitive] Model fails to count characters, reverse strings, or find character positions despite clear instructions and few-shot examples
Route all character-level string operations through code execution or tool calls. Never rely on direct model generation for character counting, string reversal, substring indexing, or any operation requiring character-level precision.
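A minimal sketch of what routing these operations through code could look like, assuming a Python execution environment is available to the model. The registry `STRING_TOOLS`, the dispatcher `run_string_tool`, and the three tool names are hypothetical illustrations, not part of any specific tool-calling API.

```python
from typing import Callable, Dict, List


def count_char(text: str, char: str) -> int:
    """Count non-overlapping, case-sensitive occurrences of `char` in `text`."""
    return text.count(char)


def reverse_string(text: str) -> str:
    """Reverse by Unicode code point (combining sequences are not kept together)."""
    return text[::-1]


def find_positions(text: str, char: str) -> List[int]:
    """Return the 0-based index of every occurrence of a single character."""
    return [i for i, c in enumerate(text) if c == char]


# Hypothetical registry: the names are illustrative, not from a real API.
STRING_TOOLS: Dict[str, Callable] = {
    "count_char": count_char,
    "reverse_string": reverse_string,
    "find_positions": find_positions,
}


def run_string_tool(name: str, **kwargs):
    """Dispatch a model-issued tool call to the matching Python function."""
    if name not in STRING_TOOLS:
        raise ValueError(f"unknown string tool: {name}")
    return STRING_TOOLS[name](**kwargs)


if __name__ == "__main__":
    print(run_string_tool("count_char", text="strawberry", char="r"))      # 3
    print(run_string_tool("reverse_string", text="strawberry"))            # yrrebwarts
    print(run_string_tool("find_positions", text="strawberry", char="r"))  # [2, 7, 8]
```

The model only has to choose a tool name and pass the string through; the answer comes from code that operates on the actual characters.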
Journey Context:
Developers assume character counting failures are prompt problems and iterate with more examples or clearer instructions. The actual cause is BPE tokenization: the model's input representation hides character boundaries. With a typical BPE vocabulary, 'strawberry' may tokenize as something like ['str', 'aw', 'berry'] (the exact split depends on the tokenizer), so the model receives three tokens, not ten characters. Each token arrives as an opaque ID, and the model is never shown the spelling of that ID directly, so it has no dependable mechanism for recovering the character count from the tokens. This is why even frontier models fail at 'how many r's in strawberry': it is an input representation failure, not a reasoning failure. No prompt can reconstruct character boundaries that are hidden at the tokenizer layer. The reliable fix is to give the model a tool (code execution) that operates on the actual character string.
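To make the representation gap concrete, the sketch below compares the token view with the character view of the same word. The token split is the one quoted in this report and is treated as an assumption: a real tokenizer may split the word differently. The helper name `compare_views` is hypothetical.

```python
from typing import Dict, List


def compare_views(word: str, tokens: List[str], char: str) -> Dict[str, object]:
    """Contrast what a token-level model sees with what code sees."""
    # The split must reassemble to the original word to be a valid tokenization.
    if "".join(tokens) != word:
        raise ValueError("tokens do not reassemble to the word")
    return {
        "token_count": len(tokens),       # units the model receives
        "char_count": len(word),          # units the question is about
        "char_occurrences": word.count(char),
        # Occurrences per token: visible to code, not exposed to the model.
        "per_token": [(t, t.count(char)) for t in tokens],
    }


if __name__ == "__main__":
    # Assumed split, taken from the report text; real tokenizers may differ.
    view = compare_views("strawberry", ["str", "aw", "berry"], "r")
    print(view["token_count"])       # 3
    print(view["char_count"])        # 10
    print(view["char_occurrences"])  # 3
    print(view["per_token"])         # [('str', 1), ('aw', 0), ('berry', 2)]
```

Code can answer the question because it indexes the string itself, while the model works from token IDs whose spellings it is not shown.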
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:35:06.195045+00:00— report_created — created