Report #94107
[counterintuitive] Why can't the LLM count characters, reverse strings, or find the nth character reliably no matter how I prompt it?
Never rely on the model's direct text output for any character-level operation. Always delegate counting, reversing, indexing, and any other character-aware manipulation to a code execution tool (Python: len(s), s[::-1], s[n]). Note that s[n] is zero-based, so the nth character in the everyday 1-based sense is s[n - 1].
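A minimal sketch of what delegating to code could look like, assuming the model has access to a Python execution tool. The helper names (count_char, reverse_text, nth_char) are hypothetical and exist only for illustration; they are not part of any library.

```python
def count_char(text: str, char: str) -> int:
    """Count occurrences of a single character in text."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    return text.count(char)


def reverse_text(text: str) -> str:
    """Return text with its characters in reverse order."""
    return text[::-1]


def nth_char(text: str, n: int) -> str:
    """Return the nth character, counting from 1 (hypothetical convention)."""
    if not 1 <= n <= len(text):
        raise IndexError(f"n must be between 1 and {len(text)}")
    return text[n - 1]


print(count_char("strawberry", "r"))
print(reverse_text("strawberry"))
print(nth_char("strawberry", 3))
```

One caveat: a Python str is a sequence of Unicode code points, so for text with combining marks or multi-code-point emoji, "character" here may not match what a reader perceives as one character.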
Journey Context:
LLMs process BPE tokens, not characters. The string 'strawberry' may tokenize as ['straw', 'berry'], so the model never sees the three individual 'r' characters. This is a limitation of the input representation, not a reasoning deficit: character-level detail is hidden before the model starts processing, and no prompt, however elaborate, changes how the input is tokenized. The model can write correct Python to count characters but cannot reliably perform the count itself, because it lacks the character-level representation. This is why GPT-4 can fail 'how many r's in strawberry' while executing len([c for c in 'strawberry' if c == 'r']) trivially succeeds. The failure is at the architecture level, not the prompting level.
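The contrast between the two views can be sketched with a toy example. The two-token split and the integer IDs below are made up for illustration; a real BPE tokenizer may split the word differently and will use different IDs.

```python
# Hypothetical vocabulary: the IDs are invented, not taken from any real tokenizer.
toy_vocab = {"straw": 1001, "berry": 1002}

# Assumed split of 'strawberry' into two tokens, as in the text above.
tokens = ["straw", "berry"]

# What the model receives: opaque integer IDs with no visible letters.
token_ids = [toy_vocab[t] for t in tokens]

# What code receives: the actual characters, so counting is exact.
text = "".join(tokens)
r_count = sum(1 for c in text if c == "r")

print(token_ids)
print(text, r_count)
```

Nothing in the ID sequence exposes the letters directly, whereas the code path iterates over them one at a time.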
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:32:49.739847+00:00— report_created — created