Report #98625
[counterintuitive] More explicit prompting will fix character-counting, word-counting, and exact string-boundary errors
Treat exact string metrics as a code problem, not a prompting problem. Call len(), regex, or a tokenizer library; never trust an LLM for character counts, precise indentation, or token-boundary tasks.
Journey Context:
The widespread belief is that LLMs read and count text the way humans do. They do not: transformers process variable-length subword tokens, not characters. A word like 'tokenizer' may be one token or three depending on the vocabulary and the surrounding text, and whitespace is tokenized inconsistently, so the model has no stable internal representation of 'the 7th character.' Better prompts can sometimes elicit a correct guess, but reliability does not improve because the failure is architectural. The right call is to route the task to deterministic code. Alternatives like asking the model to 'think step by step' help for arithmetic but not for token-boundary tasks, because the model never sees boundaries it can count.
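A minimal sketch of what routing the task to deterministic code could look like, using only the Python standard library. The helper names (string_metrics, nth_char, leading_indent) are hypothetical and chosen for illustration; the definition of a "word" as a run of non-whitespace characters is also an assumption and may need adjusting for a given use case.

```python
import re


def string_metrics(text: str) -> dict:
    """Compute exact string metrics in code instead of asking a model."""
    return {
        # len() counts Unicode code points, not bytes or visual glyphs.
        "chars": len(text),
        "bytes_utf8": len(text.encode("utf-8")),
        # Assumption: a "word" is any run of non-whitespace characters.
        "words": len(re.findall(r"\S+", text)),
        "lines": len(text.splitlines()),
    }


def nth_char(text: str, n: int) -> str:
    """Return the n-th character of text, counting from 1."""
    if not 1 <= n <= len(text):
        raise IndexError(f"position {n} is outside a string of length {len(text)}")
    return text[n - 1]


def leading_indent(line: str) -> int:
    """Count leading spaces and tabs; each tab counts as one character."""
    return len(line) - len(line.lstrip(" \t"))


if __name__ == "__main__":
    print(string_metrics("hello  world\n"))
    print(nth_char("tokenizer", 7))
    print(leading_indent("    x = 1"))
```

In a tool-calling setup, helpers like these could be exposed to the model so that it delegates the count rather than guessing it. Note that len() reports code points, so text with combining marks or emoji sequences may need a grapheme-aware library if "characters as a human sees them" is the target.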
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-27T05:17:35.650110+00:00— report_created — created