Report #82138
[counterintuitive] How to prompt the model to correctly count characters or letters in a word
Use a code execution tool or external function for any character-level operation; never rely on the LLM itself for counting, indexing, or substring operations on text.
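As a minimal sketch of what such an external function could look like, the helpers below do the counting and indexing in ordinary Python. The names count_char, char_positions, and char_at are hypothetical, chosen for illustration; they are not part of any particular tool-calling API.

```python
def count_char(text: str, char: str, case_sensitive: bool = True) -> int:
    """Count how many times a single character occurs in text."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    if not case_sensitive:
        text, char = text.lower(), char.lower()
    return text.count(char)


def char_positions(text: str, char: str) -> list:
    """Return the 0-based indices at which char occurs (case-sensitive)."""
    return [i for i, c in enumerate(text) if c == char]


def char_at(text: str, index: int) -> str:
    """Return the character at a 0-based index."""
    return text[index]


if __name__ == "__main__":
    print(count_char("strawberry", "r"))      # 3
    print(char_positions("strawberry", "r"))  # [2, 7, 8]
    print(char_at("strawberry", 0))           # s
```

How these functions are exposed to the model (code interpreter, function calling, or a post-processing step) depends on the platform. The point is only that the character-level answer comes from code operating on the string, not from the model.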
Journey Context:
LLMs process text as sequences of tokens (subword units, typically produced by byte-pair encoding, BPE), not as sequences of characters. A word like 'strawberry' may be a single token or a few subword tokens, depending on the tokenizer, so the input the model receives does not explicitly contain the individual characters that make it up. Whatever the model knows about a token's spelling is learned indirectly from training data rather than read off the input. No chain-of-thought, system prompt, or few-shot technique can add character information that is absent from the input representation. This is less a reasoning gap than an information gap at the input layer: the model sees tokens, not characters. Asking an LLM to count characters is like asking a human to count phonemes in a recording they can only hear as whole words. Every attempted prompting workaround (spell it out first, use a scratchpad) still relies on the model reconstructing character information it never directly received, which is pattern-guessing, not perception, and so cannot be relied on to be correct.
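To make the token-versus-character point concrete, here is a toy illustration. Everything in it is an assumption for demonstration: TOY_VOCAB and its integer IDs are invented, and toy_encode uses greedy longest-match rather than real BPE merge rules. It only shows the shape of what a model receives, namely a short list of integers with no characters in it.

```python
# Toy vocabulary: hypothetical pieces and IDs, not from any real tokenizer.
TOY_VOCAB = {"strawberry": 7001, "straw": 7002, "berry": 7003, "rasp": 7004}


def toy_encode(text: str) -> list:
    """Greedy longest-match encoding over TOY_VOCAB (a simplification of BPE)."""
    ids = []
    i = 0
    while i < len(text):
        # Pick the longest vocabulary piece that matches at position i.
        match = max(
            (piece for piece in TOY_VOCAB if text.startswith(piece, i)),
            key=len,
            default=None,
        )
        if match is None:
            raise ValueError(f"no toy token covers {text[i:]!r}")
        ids.append(TOY_VOCAB[match])
        i += len(match)
    return ids


if __name__ == "__main__":
    print(toy_encode("strawberry"))  # [7001]
    print(toy_encode("raspberry"))   # [7004, 7003]
    print(len("strawberry"))         # 10
```

In this toy setup a ten-character word arrives as the single integer 7001. Nothing about that integer says how many times 'r' occurs; the count is only available to code that still holds the original string.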
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:27:29.159745+00:00— report_created — created