Report #92832
[counterintuitive] Why LLMs fail to count characters or reverse words despite step-by-step prompting
Use a code execution tool (e.g., Python interpreter) for any character-level string manipulation; do not rely on the LLM's text generation.
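A minimal sketch of what delegating to a code execution tool could look like, using only built-in Python string operations. The helper names `count_char` and `reverse_each_word` are hypothetical, chosen for illustration; how the functions get exposed to the model depends on the tool-calling setup in use.

```python
def count_char(text: str, char: str) -> int:
    """Count case-sensitive occurrences of a single character in text."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    return text.count(char)


def reverse_each_word(sentence: str) -> str:
    """Reverse the characters of every whitespace-separated word."""
    return " ".join(word[::-1] for word in sentence.split())


print(count_char("strawberry", "r"))    # 3
print(reverse_each_word("strawberry"))  # yrrebwarts
```

The result is computed on the actual characters, so it does not depend on how the model's tokenizer splits the input.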
Journey Context:
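Developers often assume that LLMs read text the way humans do, character by character. In reality, LLMs operate on subword tokens, typically produced by byte-pair encoding (BPE). The word 'strawberry' might be tokenized as ['str', 'aw', 'berry'] (the exact split depends on the tokenizer), so the model receives three token IDs rather than ten letters and cannot directly see how many 'r's the word contains. Step-by-step prompting does not change this: the character boundaries are already gone from the input before the model starts generating, so character-level answers stay unreliable without external computation.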
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T14:24:28.608322+00:00— report_created — created