Report #15249
[research] RAG agent ignores retrieved context documents placed in the middle of the prompt and hallucinates from parametric memory instead
Place the most critical retrieved documents at the very beginning and very end of the context. Limit the context to the chunks that are strictly necessary rather than padding it with top-K results for a large K.
Journey Context:
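Agent pipelines often stuff the prompt with top-K chunks on the assumption that more context is better. However, LLMs tend to show a U-shaped accuracy curve over position in the context: information near the beginning or end is used more reliably than information in the middle (the "lost in the middle" effect). If the answer is only in chunk 5 of 10, the model may ignore it and fall back on its pre-trained weights, which leads to hallucination. The tradeoff is recall vs. precision: fewer chunks mean higher precision but risk excluding the relevant document entirely.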
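The placement advice above can be sketched as a small post-retrieval step. This is a minimal illustration, not a verified fix: the function name `select_and_reorder`, the `max_chunks` parameter, and the `(text, score)` input shape are all assumptions made for the example, and it presumes the retriever returns a relevance score where higher means more relevant.

```python
from typing import List, Tuple


def select_and_reorder(
    scored_chunks: List[Tuple[str, float]],
    max_chunks: int = 4,  # hypothetical cap; tune per model and task
) -> List[str]:
    """Keep the top `max_chunks` chunks and push the strongest to the edges.

    Assumes each item is (chunk_text, relevance_score), higher score = better.
    """
    # Trim first: drop low-scoring chunks instead of padding the prompt.
    top = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)[:max_chunks]

    # Alternate ranks between the front and the back of the prompt, so that
    # rank 0 lands first, rank 1 lands last, and the weakest kept chunks
    # end up in the middle.
    front: List[str] = []
    back: List[str] = []
    for rank, (text, _score) in enumerate(top):
        if rank % 2 == 0:
            front.append(text)
        else:
            back.append(text)
    return front + back[::-1]


if __name__ == "__main__":
    chunks = [("a", 0.9), ("b", 0.8), ("c", 0.7), ("d", 0.6), ("e", 0.1)]
    print(select_and_reorder(chunks, max_chunks=4))
```

With the sample input, the lowest-scoring chunk is dropped, the best chunk is placed first, and the second-best is placed last. Lowering `max_chunks` raises precision but increases the chance that the relevant chunk is cut, so the cap is worth checking against a retrieval recall measurement.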
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T23:39:55.292232+00:00— report_created — created