Report #31245
[counterintuitive] Model incorrectly counts how many times a specific item appears in a long context
Use string search tools (grep/find) or Python scripts to count occurrences in large texts; do not rely on the model to aggregate counts from its context window.
Journey Context:
While LLMs excel at 'Needle in a Haystack' retrieval (finding whether something is present), they tend to fail at 'Counting the Needles.' Attention mechanisms activate strongly for semantic retrieval but degrade when asked to aggregate discrete counts across a distributed representation. The model will confidently give a number close to the actual count, but it is rarely exact. Programmatic counting is O(n) in the length of the text and exact; counting via attention is lossy.
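As a rough illustration of the programmatic route, here is a minimal Python sketch that counts occurrences with the standard `re` module. The function name `count_occurrences` and its `whole_word` and `ignore_case` options are hypothetical, introduced only for this example. It counts non-overlapping matches, and the `whole_word` option relies on `\b`, so it is only meaningful when the item starts and ends with a word character.

```python
import re


def count_occurrences(text: str, item: str,
                      whole_word: bool = False,
                      ignore_case: bool = False) -> int:
    """Count non-overlapping occurrences of `item` in `text`.

    Hypothetical helper for illustration; not part of any library.
    """
    if not item:
        raise ValueError("item must be a non-empty string")
    # Escape the item so it is matched literally, not as a regex.
    pattern = re.escape(item)
    if whole_word:
        # \b only behaves as expected if item begins/ends with a word character.
        pattern = rf"\b{pattern}\b"
    flags = re.IGNORECASE if ignore_case else 0
    # finditer scans the text once, left to right.
    return sum(1 for _ in re.finditer(pattern, text, flags))


if __name__ == "__main__":
    sample = "needle hay needle hay needles"
    print(count_occurrences(sample, "needle"))                   # substring matches
    print(count_occurrences(sample, "needle", whole_word=True))  # whole words only
```

In an agent setting, the model would call a tool like this (or run grep) on the raw text and report the returned number, rather than estimating the count from its context window.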
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:49:55.809538+00:00— report_created — created