Report #90199
[counterintuitive] Why does the model miss information I placed in the middle of a long prompt?
Structure long contexts so that critical information appears at the beginning or end. In RAG pipelines, place the most relevant retrieved chunks at the edges of the context window, not in the middle. For very long inputs, consider splitting into multiple shorter calls rather than one long one.
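A minimal sketch of the two mechanical steps in this advice, assuming the retriever returns chunks already sorted best-first. The function names `order_for_edges` and `split_into_batches` are hypothetical helpers written for this illustration, not part of any library.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def order_for_edges(ranked: Sequence[T]) -> List[T]:
    """Reorder best-first items so the strongest sit at the start and end.

    Ranks 0, 2, 4, ... fill the front in order; ranks 1, 3, 5, ... fill the
    back in reverse. The weakest items therefore end up in the middle.
    """
    front = list(ranked[0::2])
    back = list(ranked[1::2])
    back.reverse()
    return front + back


def split_into_batches(items: Sequence[T], batch_size: int) -> List[List[T]]:
    """Split items into consecutive batches, one batch per shorter model call."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]
```

With ten chunks ranked c1 (best) to c10 (worst), `order_for_edges` yields c1, c3, c5, c7, c9, c10, c8, c6, c4, c2: the top two chunks sit at the two edges and the lowest-ranked ones fall in the center. Whether this ordering helps for a given model is worth measuring rather than assuming.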
Journey Context:
Long-context language models show a strong positional bias: they use information at the beginning and end of the input far more reliably than information in the middle. Liu et al. (2023) documented this 'Lost in the Middle' effect across multiple models and tasks: when a relevant fact is placed in the middle of a long context, retrieval accuracy drops sharply, even when the total context length is well within the model's stated window. This is not a bug in any one implementation but appears to be an emergent property of how these models learn to distribute attention over long inputs. The counterintuitive implication for RAG: if you retrieve 10 chunks and rank them by relevance, putting the most relevant chunk at position 5 (the 'center') is worse than putting it at position 1 or 10. Developers often assume that if information is in the context, the model 'sees' it equally regardless of position. In practice, information in the middle of a long prompt is much more likely to be missed. Because the effect stems from the model's behavior rather than from how the prompt is worded, instructions such as 'pay close attention to the following' do not reliably compensate for it; changing where the information sits does.
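One way to check how strongly a particular model shows this effect is to place a known fact at each position among filler documents and record whether the answer comes back. The sketch below assumes a caller-supplied `ask` function that sends a prompt to the model and returns its text reply; `build_prompt`, `found_by_position`, and the prompt layout are hypothetical choices for illustration, not the protocol used by Liu et al.

```python
from typing import Callable, Dict, List


def build_prompt(filler: List[str], needle: str, position: int, question: str) -> str:
    """Insert the needle document at the given index among the filler documents."""
    docs = filler[:position] + [needle] + filler[position:]
    body = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(docs))
    return f"{body}\n\nQuestion: {question}"


def found_by_position(
    ask: Callable[[str], str],  # assumed wrapper around the model under test
    filler: List[str],
    needle: str,
    question: str,
    expected: str,
) -> Dict[int, bool]:
    """Return, for each insertion index, whether the reply contains the expected answer."""
    results: Dict[int, bool] = {}
    for position in range(len(filler) + 1):
        reply = ask(build_prompt(filler, needle, position, question))
        results[position] = expected.lower() in reply.lower()
    return results
```

A single pass gives one true/false value per position, so repeating it over several needles and filler sets and averaging gives a more trustworthy picture. Substring matching is a crude scoring rule and may need replacing for free-form answers.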
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T09:59:42.448137+00:00— report_created — created