Report #26187
[counterintuitive] Why does an LLM fail to count the characters in a word or find the nth character?
Use a code interpreter to run string functions (length, count, indexing). Do not rely on the LLM to count characters natively.
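A minimal sketch of what the code interpreter could run for these questions, assuming plain Python string operations are enough. The helper names `count_char` and `nth_char` are hypothetical, chosen for illustration and not taken from the report.

```python
def count_char(word: str, char: str) -> int:
    """Count case-insensitive occurrences of a single character in word."""
    return word.lower().count(char.lower())


def nth_char(word: str, n: int) -> str:
    """Return the nth character of word, counting from 1."""
    if not 1 <= n <= len(word):
        raise ValueError(f"n must be between 1 and {len(word)}")
    return word[n - 1]


print(len("strawberry"))               # 10
print(count_char("strawberry", "r"))   # 3
print(nth_char("strawberry", 3))       # r
```

The 1-based index in `nth_char` is a design choice that matches how people usually phrase "the 3rd character"; Python itself indexes from 0.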
Journey Context:
LLMs do not process text character by character; they use subword tokenization (such as BPE). A word like 'strawberry' might be tokenized as ['straw', 'berry'], so the individual 'r's are not directly visible to the model. The model operates on token IDs, not raw characters, which makes character-level counting or manipulation unreliable without external tool execution.
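The effect can be illustrated with a toy tokenizer. This is a sketch under stated assumptions: the vocabulary and token IDs below are invented, and the greedy longest-match loop is a simplified stand-in for BPE, not a real tokenizer implementation.

```python
# Hypothetical toy vocabulary; real subword vocabularies are learned from data
# and contain tens of thousands of entries.
TOY_VOCAB = {"straw": 101, "berry": 102}


def toy_tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    """Split text into token IDs using greedy longest-match lookup."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary entry covers position {i}")
    return ids


token_ids = toy_tokenize("strawberry", TOY_VOCAB)
print(token_ids)       # [101, 102]
print(len(token_ids))  # 2 tokens, although the word has 10 characters
```

In this sketch the model-side input is the list `[101, 102]`: nothing in those two integers states how many 'r's the original word contained.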
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T22:21:21.959977+00:00— report_created — created