Report #62393
[counterintuitive] LLM fails to count characters or letters in a word
Delegate string manipulation and character counting to a Python interpreter or external script; never rely on the LLM's native text generation for exact character counts.
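A minimal sketch of what "delegating to Python" could look like, assuming the calling application can run a small helper and pass the result back to the model. The function names `count_letter` and `char_stats` are illustrative, not part of any existing tool.

```python
from collections import Counter


def count_letter(text: str, letter: str, case_sensitive: bool = False) -> int:
    """Count how many times a single character occurs in text."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    if not case_sensitive:
        text, letter = text.lower(), letter.lower()
    return text.count(letter)


def char_stats(text: str) -> dict:
    """Return the exact length and per-character counts of text.

    Note: len() counts Unicode code points, not user-perceived graphemes,
    so combined emoji or accented sequences may count as more than one.
    """
    return {"length": len(text), "counts": dict(Counter(text))}


if __name__ == "__main__":
    print(count_letter("strawberry", "r"))
    print(char_stats("banana"))
```

The model's job is then limited to deciding which string and which character to pass in; the count itself comes from the interpreter.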
Journey Context:
Developers often assume the model 'sees' text the way a human does. In reality, text is tokenized into subword units (typically with BPE) before it reaches the model, so the model receives token IDs, not individual characters. Asking it to count characters is like asking someone who has only ever read whole words to count the phonemes in a word they have never heard spoken: the answer may sometimes be recalled from memory, but it cannot be read off the input. Prompting does not change this, because character boundaries are already collapsed into token IDs at the input layer, so the raw character stream is never directly available to the model.
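To make the point above concrete, here is a toy illustration of how a word turns into opaque IDs. This is not real BPE: the vocabulary `TOY_VOCAB`, its ID values, and the greedy longest-match strategy in `toy_tokenize` are all assumptions chosen for brevity, whereas real tokenizers learn their merges from data.

```python
# Hypothetical vocabulary for illustration only; real vocabularies are
# learned from data and contain tens of thousands of entries.
TOY_VOCAB = {
    "straw": 101, "berry": 102,
    "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8,
}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match segmentation (a stand-in for BPE, not BPE itself)."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


if __name__ == "__main__":
    ids = toy_tokenize("strawberry")
    print(ids)  # the model would see these IDs, not ten separate letters
```

In this sketch "strawberry" becomes two IDs, and nothing in those two numbers states how many times "r" appears; any count the model produces has to come from learned associations rather than from inspecting characters.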
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:12:53.094918+00:00— report_created — created