Report #49931
[counterintuitive] LLM misses items when asked to extract all instances from a long context; needs better attention prompt
Use a map-reduce approach or code-based parsing for exhaustive extraction. Do not ask an LLM to guarantee 100% recall of all entities in a single pass over a long text.
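When the target has a regular surface form, code-based parsing can replace the LLM entirely. The sketch below assumes the items are ISO-style dates; the pattern, the `extract_all` helper, and the sample text are hypothetical and only illustrate the idea.

```python
import re

# Hypothetical target: dates written as YYYY-MM-DD.
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

def extract_all(text: str) -> list[str]:
    """Return every match in document order, duplicates included."""
    return ISO_DATE.findall(text)

sample = "Shipped 2024-01-15, patched 2024-02-03, audited 2024-01-15."
dates = extract_all(sample)
```

A regex scans every character of the input, so for a well-defined pattern recall does not depend on input length. It only works when the items can be described syntactically.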
Journey Context:
It is often assumed that if the context fits in the window, the model can 'read it all'. However, softmax attention normalizes its weights to sum to one, so the weight available to any single token tends to thin out as the sequence grows. In practice the model picks up salient items and tends to skip mundane ones. Asking the model to 'find every X' requires roughly equal attention to all tokens, which softmax attention is poorly suited to. Chunking and aggregating (map-reduce) is an architectural workaround, not just a prompt fix.
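The chunk-and-aggregate idea can be sketched as follows. This is a minimal illustration, not a verified workaround: `chunk_lines`, `map_reduce_extract`, and `regex_stand_in` are hypothetical names, and the regex extractor merely stands in for a per-chunk LLM call so the example runs offline. It assumes no item spans a line break.

```python
import re
from typing import Callable

def chunk_lines(text: str, size: int, overlap: int) -> list[str]:
    """Split text into windows of `size` lines; consecutive windows share `overlap` lines."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    lines = text.splitlines()
    step = size - overlap
    stop = max(len(lines) - overlap, 1)
    return ["\n".join(lines[i:i + size]) for i in range(0, stop, step)]

def map_reduce_extract(
    text: str,
    extract: Callable[[str], list[str]],
    size: int = 50,
    overlap: int = 5,
) -> list[str]:
    """Run `extract` on each chunk (map), then merge unique items in first-seen order (reduce)."""
    seen: set[str] = set()
    merged: list[str] = []
    for chunk in chunk_lines(text, size, overlap):
        for item in extract(chunk):
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged

def regex_stand_in(chunk: str) -> list[str]:
    """Stand-in for an LLM call: finds hypothetical ticket IDs such as OPS-12."""
    return re.findall(r"\b[A-Z]{2,5}-\d+\b", chunk)

document = "\n".join(f"row {i}: ticket OPS-{i}" for i in range(1, 8))
tickets = map_reduce_extract(document, regex_stand_in, size=3, overlap=1)
```

The overlap makes neighbouring chunks report the same item twice, which is why the reduce step deduplicates. The trade-off is that genuine repeats of the same value are also collapsed; if occurrence counts matter, the extractor would need to return positions as well.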
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:17:33.372477+00:00— report_created — created