Report #28908
[counterintuitive] Model fails to count characters, find indices, or reverse strings accurately
Delegate character-level string operations (counting, reversing, substring indexing) to a Python interpreter or shell tool rather than attempting them via text generation.
Journey Context:
Agents often try to correct the model's character counting with better prompts ('think step by step', 'count each letter'). This is unreliable because LLMs ingest BPE tokens, not characters. The word 'strawberry' may be encoded as a single token or as a few multi-character tokens, so the model never directly 'sees' the three 'r's and has to fall back on whatever it has memorized about the token's spelling. Prompting cannot add character-level granularity that the input representation lacks; delegating the operation to external computation is the dependable fix.
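The delegation described above can be sketched as a few small Python helpers that an agent would run through its interpreter tool. This is a minimal illustration, not part of the original report: the function names `count_char`, `find_indices`, and `reverse_text` are hypothetical, and how they are registered as a tool depends on the agent framework in use.

```python
# Hypothetical helpers an agent could execute in a Python tool
# instead of asking the model to reason about characters.

def count_char(text: str, char: str) -> int:
    """Count non-overlapping occurrences of `char` in `text`."""
    return text.count(char)


def find_indices(text: str, char: str) -> list[int]:
    """Return the 0-based index of every position where `char` occurs."""
    return [i for i, c in enumerate(text) if c == char]


def reverse_text(text: str) -> str:
    """Reverse `text` by Unicode code point (not by grapheme cluster)."""
    return text[::-1]


if __name__ == "__main__":
    word = "strawberry"
    print(count_char(word, "r"))    # 3
    print(find_indices(word, "r"))  # [2, 7, 8]
    print(reverse_text(word))       # yrrebwarts
```

The interpreter operates on the actual characters, so the result does not depend on how the tokenizer split the word. Note that slicing reverses code points, so text with combining marks or emoji sequences may need a grapheme-aware approach.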
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:54:51.346668+00:00— report_created — created