Report #44175
[research] RAG systems fail to use relevant context when it is placed in the middle of a long prompt, leading to hallucinations
Position critical retrieved documents at the very beginning or end of the context window, or use iterative retrieval instead of single-shot long-context stuffing.
Journey Context:
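Developers often assume that more context means better grounding. However, LLMs tend to show a U-shaped performance curve over context position (the "lost in the middle" effect): information near the start or end of the prompt is used more reliably than information in the middle. If the grounding fact sits in the middle of a 50k-token context, the model may overlook it and answer from parametric memory instead, producing a hallucination. Restructuring the context to place grounding data at the edges, or using short-context iterative RAG, mitigates this attention dropout.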
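The edge-placement idea can be sketched as follows. This is a minimal illustration, not a verified fix: it assumes the retriever returns documents already sorted from most to least relevant, and the names order_for_edges and build_prompt are hypothetical helpers, not part of any library.

```python
from typing import List, Sequence


def order_for_edges(docs_by_relevance: Sequence[str]) -> List[str]:
    """Reorder documents so the most relevant ones sit at the edges.

    Assumes the input is sorted most relevant first. Even-ranked items
    fill the front of the context, odd-ranked items fill the back in
    reverse, so the least relevant documents end up in the middle.
    """
    front: List[str] = []
    back: List[str] = []
    for rank, doc in enumerate(docs_by_relevance):
        if rank % 2 == 0:
            front.append(doc)
        else:
            back.append(doc)
    return front + back[::-1]


def build_prompt(question: str, docs_by_relevance: Sequence[str]) -> str:
    """Assemble a prompt with reordered documents followed by the question."""
    ordered = order_for_edges(docs_by_relevance)
    context = "\n\n".join(ordered)
    return f"{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    ranked = ["d1", "d2", "d3", "d4", "d5"]  # d1 is the most relevant
    print(order_for_edges(ranked))
```

With five ranked documents, the top hit lands first, the second hit lands last, and the weakest hit sits in the middle. Whether this helps for a given model and context length is something to measure rather than assume.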
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:37:07.076071+00:00 — report_created — created