Report #70237
[counterintuitive] LLM fails to count characters or spell words correctly because it needs a better step-by-step prompt
Offload character-level tasks (counting, spelling, reversing) to a Python interpreter or an external script; do not rely on the LLM's native token-by-token generation for character manipulation.
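A minimal sketch of what "offloading" could look like, assuming the LLM (or the surrounding application) can call a small Python helper instead of answering from its own generation. The function names `count_char`, `spell_out`, and `reverse_text` are hypothetical and chosen only for illustration; they are not part of any tool-calling API.

```python
def count_char(text: str, char: str, case_sensitive: bool = False) -> int:
    """Count how many times a single character occurs in text."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    if not case_sensitive:
        text, char = text.lower(), char.lower()
    return text.count(char)


def spell_out(text: str) -> list[str]:
    """Return the characters of text, one per list element."""
    # Note: this splits on Unicode code points, not grapheme clusters,
    # so combined emoji or accented sequences may split unexpectedly.
    return list(text)


def reverse_text(text: str) -> str:
    """Return text with its characters in reverse order."""
    return text[::-1]


if __name__ == "__main__":
    print(count_char("strawberry", "r"))  # 3
    print(spell_out("berry"))             # ['b', 'e', 'r', 'r', 'y']
    print(reverse_text("strawberry"))     # yrrebwarts
```

The point of the design is that the answer comes from deterministic string operations on the actual characters, so the model only has to decide which helper to call and with which arguments.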
Journey Context:
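Humans see text as characters; LLMs see text as subword tokens (typically produced by BPE or a similar scheme). Depending on the tokenizer, 'strawberry' might be split into pieces such as 'str', 'aw', 'berry', or kept as a single token. Asking an LLM to count the 'r's in 'strawberry' requires it to map tokens back to characters, a task it was not architecturally built to do. The model receives opaque token IDs rather than letters, so the spelling of each token is never directly visible in its input and is known only to the extent it was learned implicitly during training. Prompt engineering cannot reliably bridge this token-character gap, because no prompt changes what the model actually receives.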
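The gap described above can be illustrated with a toy tokenizer. This is a sketch under stated assumptions: the three-entry vocabulary and its IDs are invented, and the greedy longest-match encoder is a simplified stand-in, not real BPE and not the tokenizer of any actual model.

```python
# Hypothetical vocabulary: pieces and IDs are made up for illustration.
TOY_VOCAB = {"str": 101, "aw": 102, "berry": 103}


def toy_encode(word: str, vocab: dict[str, int] = TOY_VOCAB) -> list[int]:
    """Greedy longest-match split of word into token IDs (not real BPE)."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining piece first, then shrink.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i:]!r}")
    return ids


def toy_decode(ids: list[int], vocab: dict[str, int] = TOY_VOCAB) -> str:
    """Map token IDs back to text using the inverse vocabulary."""
    inverse = {token_id: piece for piece, token_id in vocab.items()}
    return "".join(inverse[token_id] for token_id in ids)


if __name__ == "__main__":
    ids = toy_encode("strawberry")
    print(ids)                         # [101, 102, 103]: what the model sees
    print(toy_decode(ids).count("r"))  # 3: counting needs the decoded text
```

Nothing in the ID sequence itself exposes the letter 'r'; the count only becomes trivial once the IDs are decoded back to characters, which is what an external interpreter works with.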
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:28:13.932833+00:00— report_created — created