Report #92335
[counterintuitive] Model fails to count characters or reverse strings — needs a better prompt
Delegate all character-level string operations (counting, reversing, substring indexing) to code execution or external tools. Never rely on direct model generation for these tasks regardless of prompt sophistication or chain-of-thought length.
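A minimal sketch of what delegating these operations to code might look like, in Python. The function names and the `STRING_TOOLS` dispatch table are hypothetical, chosen for illustration; they are not part of any particular tool-calling framework.

```python
# Hypothetical helpers a tool-calling layer could expose to the model.
# The model supplies the arguments; the runtime does the character work.

def count_char(text: str, char: str) -> int:
    """Count case-sensitive occurrences of a single character."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    return text.count(char)


def reverse_text(text: str) -> str:
    """Reverse by Unicode code point (not by grapheme cluster)."""
    return text[::-1]


def char_at(text: str, index: int) -> str:
    """Return the character at a zero-based index."""
    return text[index]


# Assumed dispatch table: tool name -> callable.
STRING_TOOLS = {
    "count_char": count_char,
    "reverse_text": reverse_text,
    "char_at": char_at,
}


def run_string_tool(name: str, **kwargs):
    """Look up a tool by name and call it with the given arguments."""
    if name not in STRING_TOOLS:
        raise KeyError(f"unknown string tool: {name}")
    return STRING_TOOLS[name](**kwargs)


if __name__ == "__main__":
    print(run_string_tool("count_char", text="strawberry", char="r"))
    print(run_string_tool("reverse_text", text="strawberry"))
```

The point of the design is that the answer comes from the runtime operating on the actual string, so it does not depend on how the model's tokenizer split the word.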
Journey Context:
The common belief is that character-counting failures are a reasoning deficit that better prompting can fix. In practice, BPE tokenization merges characters into opaque tokens before the model sees them: 'strawberry' may become tokens like ['str', 'aw', 'berry'] (the exact split depends on the tokenizer), and the model receives token IDs rather than the character sequence. Character-level information is not directly represented in the input, so prompting, few-shot examples, and 'think step by step' cannot reliably recover it. This is why a model can discuss quantum physics but fail at 'how many r's in strawberry.' The fix is architectural (character-level or byte-level models) or external (code execution). Prompting harder does not address the root cause.
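To make the tokenization point concrete, here is a toy illustration in Python. The vocabulary, the token IDs, and the greedy longest-match encoder are all assumptions made for this sketch; real BPE tokenizers use learned merge rules and will likely split the word differently.

```python
# Toy vocabulary with made-up IDs (hypothetical, not from a real tokenizer).
TOY_VOCAB = {"str": 101, "aw": 102, "berry": 103}


def encode(text: str, vocab: dict[str, int]) -> list[int]:
    """Greedy longest-match encoding; a simplification of real BPE."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest remaining piece first, then shrink.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in vocab:
                ids.append(vocab[piece])
                pos = end
                break
        else:
            raise ValueError(f"no token covers {text[pos:]!r}")
    return ids


def decode(ids: list[int], vocab: dict[str, int]) -> str:
    """Map IDs back to their text pieces and join them."""
    pieces = {token_id: piece for piece, token_id in vocab.items()}
    return "".join(pieces[token_id] for token_id in ids)


if __name__ == "__main__":
    ids = encode("strawberry", TOY_VOCAB)
    print(ids)  # the model-side view: integers, no letters
    # Counting requires the character sequence, which code can recover.
    print(decode(ids, TOY_VOCAB).count("r"))
```

The list of integers is all the model is given as input; the letter count only becomes trivial once the string is handled as characters, which is what code execution provides.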
Lifecycle
2026-06-22T13:34:27.180772+00:00— report_created — created