Report #93080
[counterintuitive] Why does the model fail at reversing strings or doing character-level string operations?
Delegate all character-level string operations — reversal, anagram checking, palindrome verification, substring indexing — to code execution, never to text generation.
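A minimal sketch of what this delegation could look like, assuming Python is the code-execution environment. The function names are illustrative and not part of any particular tool API. The functions work on Unicode code points, so text with combining marks or multi-code-point emoji would need grapheme-aware handling that is not shown here.

```python
from collections import Counter


def reverse_string(s: str) -> str:
    """Reverse a string code point by code point."""
    return s[::-1]


def is_palindrome(s: str) -> bool:
    """Check whether s reads the same both ways, ignoring case and non-alphanumerics."""
    cleaned = [ch.lower() for ch in s if ch.isalnum()]
    return cleaned == cleaned[::-1]


def is_anagram(a: str, b: str) -> bool:
    """Check whether a and b use the same letters and digits the same number of times."""
    count_a = Counter(ch.lower() for ch in a if ch.isalnum())
    count_b = Counter(ch.lower() for ch in b if ch.isalnum())
    return count_a == count_b


def nth_char(s: str, n: int) -> str:
    """Return the n-th character of s, counting from 1."""
    if not 1 <= n <= len(s):
        raise IndexError(f"n must be between 1 and {len(s)}, got {n}")
    return s[n - 1]
```

In this setup the model's job is limited to choosing which function to call and passing the string through unchanged; the character-level work happens in the interpreter.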
Journey Context:
String reversal looks like a simple algorithmic task, so developers assume the model just needs better instructions. But BPE tokenization means 'hello' might be a single token [15339], not the sequence ['h','e','l','l','o']. Reversing a single token is undefined: the model must first infer the character composition of the token (itself a lossy guess), then reverse that inferred composition, then emit the result. Each step compounds error. This is not the model being 'bad at algorithms'; it literally does not possess the input representation required. The same applies to any operation that requires character-level access: finding the nth character, checking if a string is a palindrome, generating an anagram. Code execution is the only reliable path because it operates on the actual character array, not on token embeddings.
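The mismatch between tokens and characters can be illustrated with a toy vocabulary. This is not a real BPE tokenizer: the vocabulary, the greedy longest-match rule, and every ID except the 15339 quoted above are made up for illustration. The sketch only shows that reversing a token sequence is a different operation from reversing the characters.

```python
# Toy vocabulary: hypothetical token strings and IDs, not taken from any real tokenizer.
# Only the 15339 for "hello" echoes the example in the text.
TOY_VOCAB = {"hello": 15339, " world": 101, "!": 102}
ID_TO_TEXT = {token_id: piece for piece, token_id in TOY_VOCAB.items()}


def toy_encode(text: str) -> list[int]:
    """Split text into toy tokens using greedy longest-match."""
    pieces = sorted(TOY_VOCAB, key=len, reverse=True)
    ids = []
    i = 0
    while i < len(text):
        for piece in pieces:
            if text.startswith(piece, i):
                ids.append(TOY_VOCAB[piece])
                i += len(piece)
                break
        else:
            raise ValueError(f"no toy token covers position {i}")
    return ids


def toy_decode(ids: list[int]) -> str:
    """Concatenate the text of each toy token."""
    return "".join(ID_TO_TEXT[token_id] for token_id in ids)


text = "hello world!"
ids = toy_encode(text)                   # three opaque IDs, no letters visible
token_reversed = toy_decode(ids[::-1])   # reverses token order only
char_reversed = text[::-1]               # reverses the actual characters

print(ids)
print(repr(token_reversed))
print(repr(char_reversed))
```

Nothing in the ID sequence exposes the letters inside each token, which is why the character-level result has to be computed on the string itself.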
Lifecycle
2026-06-22T14:49:23.347038+00:00— report_created — created