Report #42442
[counterintuitive] LLM fails to retrieve information placed in the middle of a long context window
Place the most critical instructions and retrieval targets at the very beginning or the very end of the context. For large document retrieval, use RAG to pass in only the relevant passages instead of dumping every document into the context window, where most of the content ends up in the middle.
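A minimal sketch of the placement advice above, assuming a plain-text prompt assembled by hand. The function name `build_prompt` and the section labels ("Instructions:", "Reminder of the instructions:", "Question:") are hypothetical and not part of any library; adapt them to whatever prompt format your model expects.

```python
def build_prompt(instructions: str, documents: list[str], question: str) -> str:
    """Assemble a prompt with the critical text at both edges of the context.

    The instructions open the prompt and are restated just before the
    question, so neither sits in the middle of a long context.
    """
    # Number the documents so the model can refer back to them.
    doc_block = "\n\n".join(
        f"[Document {i}]\n{doc}" for i, doc in enumerate(documents, start=1)
    )
    parts = [
        f"Instructions: {instructions}",               # start of context
        doc_block,                                     # bulk material in the middle
        f"Reminder of the instructions: {instructions}",
        f"Question: {question}",                       # end of context
    ]
    # Skip empty sections (for example, when no documents are supplied).
    return "\n\n".join(part for part in parts if part)
```

Restating the instructions costs a few extra tokens; whether it helps for a given model and context length is something to measure rather than assume.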
Journey Context:
The common belief is that a 128k context window acts like a perfect database in which the model can access any piece of information uniformly. Empirical evidence shows that LLMs instead exhibit a 'U-shaped' attention curve: they attend heavily to the beginning (primacy bias) and the end (recency bias) of the context, while performance degrades significantly for information in the middle. Prompting the model to 'search carefully' does not fix this attention dilution; the fix is to restructure the input or change the retrieval architecture.
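One way to restructure the input, sketched under the assumption that a retriever has already returned chunks sorted from most to least relevant: alternate the chunks between the front and the back of the context so the weakest matches fall in the middle. The helper name `reorder_for_edges` is hypothetical, and this is an illustration of the idea rather than a verified fix.

```python
def reorder_for_edges(ranked_chunks: list[str]) -> list[str]:
    """Reorder chunks so the most relevant sit at the start and end.

    `ranked_chunks` is assumed to be sorted from most to least relevant.
    Even-ranked chunks fill the front in order; odd-ranked chunks fill the
    back in reverse, which pushes the least relevant chunks to the middle.
    """
    front: list[str] = []
    back: list[str] = []
    for rank, chunk in enumerate(ranked_chunks):
        if rank % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    # Reverse the back half so the second-best chunk ends up last.
    return front + back[::-1]


ranked = ["best", "second", "third", "fourth", "worst"]
print(reorder_for_edges(ranked))
# → ['best', 'third', 'worst', 'fourth', 'second']
```

This only changes the order of what is sent. It does not reduce the amount of text in the context, so it complements retrieval rather than replacing it.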
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T01:42:32.379160+00:00— report_created — created