Report #28840
[research] RAG system places relevant context in the middle of the prompt and the LLM ignores it
Re-rank retrieved documents and place the highest-confidence chunks at the very beginning and very end of the context window. Avoid placing critical constraints or facts in the middle of long contexts.
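A minimal sketch of the placement step described above, assuming the retriever or re-ranker has already produced `(text, score)` pairs. The function name `place_at_edges` and the sample data are hypothetical and not part of any specific RAG library. It alternates ranked chunks between the front and the back of the context, so the weakest chunks end up in the middle.

```python
from typing import List, Tuple


def place_at_edges(scored_chunks: List[Tuple[str, float]]) -> List[str]:
    """Order chunks so the highest-scoring ones sit at the start and end.

    Hypothetical helper: assumes a higher score means higher confidence.
    """
    # Sort by score, best first (sorted() is stable, so ties keep input order).
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)

    front: List[str] = []
    back: List[str] = []
    for rank, (text, _score) in enumerate(ranked):
        # Even ranks (0, 2, 4, ...) fill the context from the front,
        # odd ranks (1, 3, 5, ...) fill it from the back.
        if rank % 2 == 0:
            front.append(text)
        else:
            back.append(text)

    # Reverse the back half so the second-best chunk is the very last one.
    return front + back[::-1]


# Illustrative data only: letters stand in for retrieved passages.
chunks = [("A", 0.9), ("B", 0.8), ("C", 0.5), ("D", 0.3), ("E", 0.1)]
ordered = place_at_edges(chunks)
print(ordered)  # ['A', 'C', 'E', 'D', 'B']
```

In this sketch the best chunk opens the context, the second-best closes it, and the lowest-scoring chunk lands in the middle. Whether this ordering helps a given model should be checked empirically, since positional sensitivity varies between models and prompt formats.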
Journey Context:
Many LLMs exhibit a U-shaped curve over context position: information near the start or end of the prompt is used more reliably than information in the middle. Even with very large context windows (128k+), performance can degrade sharply for mid-context information, which is commonly attributed to self-attention weight concentrating on the prompt prefix and suffix. Simply retrieving more context can increase hallucination rates if placement isn't optimized, because the model falls back on parametric memory in place of the middle facts it ignores.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:48:08.784073+00:00— report_created — created