Report #93919
[counterintuitive] Why does the model miss information I placed in the middle of a long context even though it's clearly there?
Place the most critical information at the very beginning or very end of your context. For retrieval-heavy tasks, use RAG to surface only relevant chunks rather than stuffing everything into the context window.
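A minimal sketch of that placement rule, assuming a plain-text prompt assembled by hand. The function name `assemble_prompt` and the section labels are hypothetical and not part of any library. It puts the critical text first, the bulk material in the middle, and restates the critical text with the question at the end.

```python
def assemble_prompt(critical: str, background: list[str], question: str) -> str:
    """Build a prompt with critical text at both edges of the context.

    The section labels are illustrative only; any clear delimiter works.
    """
    parts = [
        # Critical facts go first, where the model reads them reliably.
        "KEY FACTS:\n" + critical,
        # Bulk material sits in the middle, the weakest position.
        "BACKGROUND:\n" + "\n\n".join(background),
        # Restate the critical facts so they also appear near the end.
        "REMINDER OF KEY FACTS:\n" + critical,
        # The question comes last, right before the model answers.
        "QUESTION:\n" + question,
    ]
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(
        assemble_prompt(
            critical="The deadline is Friday.",
            background=["Meeting notes ...", "Older email thread ..."],
            question="When is the deadline?",
        )
    )
```

Repeating the critical text costs a few tokens, so this only makes sense when that text is short relative to the background.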
Journey Context:
The common belief is that a 128k+ context window means the model can use all 128k tokens equally well. Developers stuff entire codebases or document collections into context, assuming the model will find and use what it needs. Research demonstrates that LLMs exhibit a U-shaped performance curve over position: they use information at the beginning and end of the context most reliably, and accuracy degrades significantly for information in the middle. This is not a bug that more prompting will fix; it appears to be a property of how transformer attention distributes over long sequences. Adding context beyond what the task needs can actually hurt, because it pushes relevant material into the middle. RAG remains useful even with long context windows because it limits the context to relevant chunks and lets you control where they appear (near the end, where recall is strongest).
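The retrieval idea can be sketched as follows, under loud assumptions: the scorer is a toy word-overlap measure standing in for a real embedding or BM25 retriever, and the names `score`, `select_chunks`, and `build_context` are hypothetical. The sketch keeps only the top `k` chunks and orders them so the best match sits last, closest to the question.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(query: str, chunk: str) -> float:
    """Toy relevance: fraction of query words that appear in the chunk."""
    q = tokenize(query)
    if not q:
        return 0.0
    return len(q & tokenize(chunk)) / len(q)


def select_chunks(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Keep the k best chunks, ordered so the most relevant one is last."""
    ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
    return list(reversed(ranked[:k]))


def build_context(query: str, chunks: list[str], k: int = 3) -> str:
    """Join the selected chunks and put the question at the very end."""
    selected = select_chunks(query, chunks, k)
    return "\n\n".join(selected) + "\n\nQuestion: " + query


if __name__ == "__main__":
    docs = [
        "The cache uses LRU eviction.",
        "Billing runs nightly at 02:00 UTC.",
        "Logging is configured in YAML.",
        "Billing retries failed charges three times.",
    ]
    print(build_context("When does billing run nightly?", docs, k=2))
```

In a real pipeline the scorer would be swapped for a proper retriever; the part that matters here is the cap on `k` and the ordering, which keep the context short and the strongest evidence next to the question.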
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:13:46.929839+00:00— report_created — created