Report #74912
[counterintuitive] Why does the model fail to retrieve information from the middle of a long document even though the document fits within the token limit?
Place critical instructions and retrieved documents at the very beginning or very end of the prompt context. Avoid burying crucial facts in the middle of long contexts.
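A minimal sketch of this placement rule, assuming the retrieved documents arrive already sorted most relevant first. The function names `reorder_for_edges` and `build_prompt` are hypothetical helpers for illustration, not part of any library. The idea is to push the strongest documents to the two ends of the context and let the weakest ones fall in the middle.

```python
from typing import List, Sequence


def reorder_for_edges(ranked_docs: Sequence[str]) -> List[str]:
    """Put the highest-ranked documents at the two ends, lowest in the middle.

    Assumption: `ranked_docs` is sorted most relevant first.
    """
    front: List[str] = []
    back: List[str] = []
    for rank, doc in enumerate(ranked_docs):
        # Alternate: even ranks fill the front, odd ranks fill the back.
        if rank % 2 == 0:
            front.append(doc)
        else:
            back.append(doc)
    # Reverse the back half so the second-best document ends up last.
    return front + back[::-1]


def build_prompt(instructions: str, ranked_docs: Sequence[str], question: str) -> str:
    """Instructions at the start, reordered documents next, question at the end."""
    body = "\n\n".join(reorder_for_edges(ranked_docs))
    return f"{instructions}\n\n{body}\n\n{question}"


if __name__ == "__main__":
    docs = ["best", "second", "third", "fourth", "worst"]
    print(reorder_for_edges(docs))
```

With five ranked documents, the best one lands first, the second-best lands last, and the lowest-ranked one sits in the middle. Whether this helps for a particular model and prompt length should be checked empirically.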
Journey Context:
The common mental model is that the context window is a perfect, uniform memory bank: if it fits, the model 'knows' it. Research on long-context transformers instead shows a U-shaped curve when retrieval quality is plotted against position. Models make the best use of information near the beginning (primacy bias) and near the end (recency bias) of the context, and do noticeably worse with information placed in the middle. How severe the drop is varies by model and context length. This is generally understood as an artifact of how attention scores distribute over long sequences, not a failure of the model to 'try' hard enough.
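One way to check whether this effect applies to a given model is a position probe: plant the same known fact at different depths in otherwise irrelevant filler text and compare the answers. The sketch below only builds the probe prompts. The names `insert_needle` and `build_probes` are hypothetical, and the call to the model itself is left out because it depends on the client in use.

```python
from typing import Dict, List, Sequence


def insert_needle(filler: Sequence[str], needle: str, depth: float) -> List[str]:
    """Insert `needle` into `filler` at a relative depth (0.0 = start, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    index = int(depth * len(filler))
    return list(filler[:index]) + [needle] + list(filler[index:])


def build_probes(
    filler: Sequence[str], needle: str, depths: Sequence[float]
) -> Dict[float, str]:
    """Build one context string per depth, identical except for needle position."""
    return {d: "\n".join(insert_needle(filler, needle, d)) for d in depths}


if __name__ == "__main__":
    # Placeholder filler and needle; real probes would use much longer text.
    filler = [f"Filler sentence number {i}." for i in range(8)]
    needle = "The access code is 4417."
    for depth, context in build_probes(filler, needle, [0.0, 0.5, 1.0]).items():
        position = context.split("\n").index(needle)
        print(f"depth={depth}: needle at line {position}")
```

Sending each probe with the same question and scoring the answers by depth gives a rough picture of where retrieval weakens for that model.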
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:20:10.857158+00:00— report_created — created