Report #67911
[research] LLM fails to retrieve or utilize facts located in the middle of a long RAG context window
Re-rank retrieved documents to place the most relevant chunks at the very beginning and very end of the prompt context, or force the model to answer per-chunk before synthesizing.
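A minimal sketch of the edge-placement re-ranking described above, assuming each retrieved chunk already carries a relevance score from the retriever or a re-ranker. The function name `place_relevant_at_edges` and the (text, score) tuple shape are illustrative assumptions, not part of any specific library.

```python
from typing import List, Sequence, Tuple


def place_relevant_at_edges(scored_chunks: Sequence[Tuple[str, float]]) -> List[str]:
    """Order chunks so the highest-scoring ones sit at the start and end.

    scored_chunks: (chunk_text, relevance_score) pairs; a higher score means
    more relevant. This input shape is an assumption for illustration.
    """
    # Rank from most to least relevant.
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)

    front: List[str] = []
    back: List[str] = []
    # Alternate: rank 1 goes to the front, rank 2 to the back, rank 3 to the
    # front, and so on. The least relevant chunks end up in the middle.
    for position, (chunk, _score) in enumerate(ranked):
        if position % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)

    # Reverse the back half so the second-best chunk is the very last one.
    return front + back[::-1]


if __name__ == "__main__":
    docs = [("a", 0.9), ("b", 0.8), ("c", 0.5), ("d", 0.2), ("e", 0.1)]
    print(place_relevant_at_edges(docs))
```

With five chunks scored in descending order a to e, the best chunk lands first, the second-best lands last, and the weakest chunk sits in the middle of the prompt.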
Journey Context:
Even with very large context windows, LLMs tend to show a U-shaped performance curve over position in the context. Facts near the beginning (including the system prompt) and near the end are used reliably, while facts in the middle are retrieved much less accurately. Simply dumping 50 retrieved documents into the context makes it likely that the middle documents will be underused or ignored. Re-ranking or map-reduce (answering per document) mitigates this positional bias.
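The map-reduce option (answer per chunk, then synthesize) might look roughly like the sketch below. The `llm` argument is a hypothetical callable that takes a prompt string and returns the model's text; the prompt wording and the `NO_ANSWER` sentinel are likewise assumptions and would need tuning for a real model.

```python
from typing import Callable, List, Sequence


def answer_per_chunk_then_synthesize(
    question: str,
    chunks: Sequence[str],
    llm: Callable[[str], str],  # hypothetical: prompt in, completion text out
) -> str:
    """Ask the model about each chunk alone, then merge the partial answers."""
    partial_answers: List[str] = []

    # Map step: every chunk gets its own short prompt, so no chunk is ever
    # buried in the middle of a long context.
    for index, chunk in enumerate(chunks, start=1):
        map_prompt = (
            "Answer the question using only the passage below. "
            "If the passage does not contain the answer, reply exactly NO_ANSWER.\n\n"
            f"Passage:\n{chunk}\n\nQuestion: {question}"
        )
        reply = llm(map_prompt).strip()
        if reply and reply != "NO_ANSWER":
            partial_answers.append(f"[chunk {index}] {reply}")

    # Nothing useful was found in any chunk.
    if not partial_answers:
        return "NO_ANSWER"

    # Reduce step: the model only sees the short partial answers.
    reduce_prompt = (
        "Combine the partial answers below into one final answer to the question.\n\n"
        f"Question: {question}\n\nPartial answers:\n" + "\n".join(partial_answers)
    )
    return llm(reduce_prompt)
```

This trades one long call for one call per chunk plus a final synthesis call, so cost and latency grow with the number of retrieved chunks.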
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:28:21.788596+00:00— report_created — created