Report #86870
[counterintuitive] LLM fails to retrieve a specific fact placed in the middle of a long context window, despite perfect recall at the beginning and end
Structure retrieved context by placing the most critical information at the very beginning or very end of the prompt, or use multiple smaller context windows rather than one massive context.
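A minimal sketch of both suggestions above, assuming retrieved chunks arrive as (text, relevance score) pairs. The helper names `order_for_edges` and `split_into_windows` are hypothetical and not part of any library, and the scoring itself is assumed to come from whatever retriever is already in use.

```python
from typing import List, Sequence, Tuple


def order_for_edges(scored_chunks: Sequence[Tuple[str, float]]) -> List[str]:
    """Place the highest-scoring chunks at the start and end of the prompt.

    Chunks are ranked by score (descending) and dealt alternately to the
    front and the back, so the weakest chunks land in the middle.
    """
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    front: List[str] = []
    back: List[str] = []
    for rank, (text, _score) in enumerate(ranked):
        if rank % 2 == 0:
            front.append(text)
        else:
            back.append(text)
    # Reverse the back half so the second-best chunk is the very last one.
    return front + back[::-1]


def split_into_windows(chunks: Sequence[str], window_size: int) -> List[List[str]]:
    """Split chunks into several smaller contexts of at most window_size chunks."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    return [list(chunks[i:i + window_size]) for i in range(0, len(chunks), window_size)]


if __name__ == "__main__":
    # Hypothetical example data: letters stand in for retrieved passages.
    chunks = [("a", 0.9), ("b", 0.8), ("c", 0.5), ("d", 0.2), ("e", 0.1)]
    print(order_for_edges(chunks))
    print(split_into_windows([text for text, _ in chunks], 2))
```

Each window returned by `split_into_windows` would be sent as a separate, shorter prompt. How the per-window answers are merged afterwards is left open here.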
Journey Context:
Developers often assume a 128k context window behaves like a perfectly indexed database, and that an instruction such as 'read carefully' will fix retrieval failures. In practice, Transformer attention is diluted over long sequences: softmax normalizes scores across every token, so the weight available to any single fact shrinks as the context grows. Models also tend to attend more strongly to tokens near the start (primacy) and the end (recency) of the prompt. 'Lost in the middle' is an architectural artifact of how attention scores distribute over long sequences, not a failure of reading comprehension.
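The dilution part of this argument can be illustrated with a toy calculation. This is a sketch under strong simplifying assumptions: a single attention head, one target token whose logit exceeds every other token's by a fixed gap, and all other logits equal to zero. It does not model the primacy or recency bias, only how softmax weight on one token shrinks as the sequence grows. The function name `target_attention_weight` is hypothetical.

```python
import math


def target_attention_weight(context_length: int, logit_gap: float) -> float:
    """Softmax weight on one target token among context_length tokens.

    Assumes the target has logit `logit_gap` and every other token has logit 0.
    """
    if context_length < 1:
        raise ValueError("context_length must be at least 1")
    target = math.exp(logit_gap)
    distractors = context_length - 1  # each distractor contributes exp(0) = 1
    return target / (target + distractors)


if __name__ == "__main__":
    # Same logit gap, growing context: the target's share of attention falls.
    for n in (1_000, 10_000, 100_000):
        print(n, round(target_attention_weight(n, 5.0), 4))
```

With the gap held fixed, the weight on the target falls roughly in proportion to 1/n once the context is long. That is one way to read why a fact that is easy to find in a short prompt can be missed in a very long one.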
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:23:48.558704+00:00— report_created — created