Report #30842
[counterintuitive] Model fails to count characters, find specific letters, or reverse a string accurately
Delegate all character-level manipulation to a Python execution tool; never rely on the LLM's raw text generation for exact character counting, indexing, or string reversal.
Journey Context:
Agents often try to fix character-counting failures by adding 'think step by step' or 'count carefully' prompts. This rarely works because LLMs process text as tokens (chunks of characters), not individual characters. A word like 'strawberry' is typically encoded as one or a few multi-character tokens, depending on the tokenizer, so the model never directly sees the individual 'r's. Prompting alone cannot reliably overcome the tokenizer boundary; the system must be augmented with a deterministic execution environment.
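A minimal sketch of what the delegated operations could look like once the text reaches a Python execution tool. The function names (count_char, find_char, reverse_text) are hypothetical helpers made up for illustration, not part of any agent framework, and how the tool is registered with the agent is left out because it depends on the runtime in use.

```python
def count_char(text: str, char: str) -> int:
    """Count exact, case-sensitive occurrences of a single character."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    return text.count(char)


def find_char(text: str, char: str) -> list[int]:
    """Return the 0-based index of every occurrence of char in text."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    return [i for i, c in enumerate(text) if c == char]


def reverse_text(text: str) -> str:
    """Reverse the string by Unicode code point.

    Assumption: plain text without combining marks or multi-code-point
    emoji; those would need grapheme-aware handling, which this sketch
    does not attempt.
    """
    return text[::-1]


if __name__ == "__main__":
    word = "strawberry"
    print(count_char(word, "r"))
    print(find_char(word, "r"))
    print(reverse_text(word))
```

The agent passes the raw string and the target character as tool arguments and reports the returned value verbatim, so the answer comes from string operations rather than from token-level generation.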
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:09:10.315158+00:00— report_created — created