Report #67794
[counterintuitive] Why can't the model reliably reverse a string or perform character-level string manipulation?
Route all string manipulation—reversal, character substitution at specific indices, palindrome checking—to code execution. Treat these as tool-call tasks, not text-generation tasks, regardless of how trivial they seem.
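As a minimal sketch of what such a tool call could execute, the Python helpers below cover the three operations named above. The function names (reverse_string, replace_at, is_palindrome) are hypothetical and not part of any particular tool framework. The palindrome check assumes that case and non-alphanumeric characters should be ignored, which may not match every use case.

```python
def reverse_string(s: str) -> str:
    """Reverse a string by Unicode code point."""
    return s[::-1]


def replace_at(s: str, index: int, replacement: str) -> str:
    """Return a copy of s with the character at `index` replaced."""
    if not 0 <= index < len(s):
        raise IndexError(f"index {index} out of range for length {len(s)}")
    return s[:index] + replacement + s[index + 1:]


def is_palindrome(s: str) -> bool:
    """Palindrome check ignoring case and non-alphanumeric characters (an assumption)."""
    cleaned = [c.casefold() for c in s if c.isalnum()]
    return cleaned == cleaned[::-1]
```

Slicing reverses code points, not user-perceived characters, so strings with combining marks or multi-code-point emoji may need grapheme-aware handling instead.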
Journey Context:
String reversal requires character-level access, but LLMs process subword tokens (typically BPE). A common word like 'hello' may be a single token, so the model receives one token ID rather than the individual units 'h', 'e', 'l', 'l', 'o'. Asking it to reverse 'hello' is like asking a human to reverse a word written in a script where each word is a single opaque glyph they can't decompose. The model may appear to reverse some short common strings correctly, but this likely reflects patterns memorized from training data rather than character-by-character computation. It fails unpredictably on longer, less common, or nonsense strings. This is the same root cause as character counting: tokenization hides character-level structure before the model ever sees the input. The failure stems from the input representation, so prompting alone does not reliably fix it.
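To illustrate the point about tokens, here is a toy sketch. The vocabulary and the greedy longest-match segmentation are both assumptions made for illustration: real BPE vocabularies are learned from data and applied through merge rules, and actual token IDs differ by model.

```python
# Hypothetical toy vocabulary, not taken from any real tokenizer.
TOY_VOCAB = {"hello": 0, "he": 1, "llo": 2, "h": 3, "e": 4, "l": 5, "o": 6}


def toy_tokenize(text: str) -> list[int]:
    """Greedy longest-match segmentation, a simplification of BPE."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest candidate piece first, then shrink.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in TOY_VOCAB:
                ids.append(TOY_VOCAB[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


print(toy_tokenize("hello"))  # [0]
print(toy_tokenize("olleh"))  # [6, 5, 5, 4, 3]
```

In this sketch the input arrives as the single ID 0, while the correct reversed output is five unrelated IDs. Nothing in the ID 0 itself exposes the letters, so the mapping between the two has to be learned rather than computed from the input.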
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:16:22.354668+00:00— report_created — created