Report #93828
[research] Hallucinating an answer instead of using retrieved context, or failing to retrieve information placed in the middle of a long context window
Place the most relevant retrieved chunks at the very beginning or end of the prompt context, where the model is most likely to use them. Enforce a strict 'answer only from context' constraint via fine-tuning or explicit prompt instructions.
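A minimal sketch of the 'answer only from context' constraint expressed as a prompt template. The function name build_grounded_prompt, the REFUSAL string, and the exact instruction wording are illustrative assumptions, not something the report specifies; tune them for the model in use.

```python
from typing import Sequence

# Hypothetical fixed refusal string, so ungrounded answers are easy to detect.
REFUSAL = "I cannot answer from the provided context."


def build_grounded_prompt(question: str, chunks: Sequence[str]) -> str:
    """Build a prompt that tells the model to answer only from `chunks`.

    Chunks are emitted in the order given, so any reordering should
    happen before this call.
    """
    numbered = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using ONLY the context below.\n"
        f"If the context does not contain the answer, reply exactly: {REFUSAL}\n"
        "Cite the bracketed chunk numbers you relied on.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Numbering the chunks and asking for citations makes it easier to check afterwards whether an answer came from the context or from parametric memory.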
Journey Context:
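Models suffer from 'lost in the middle' degradation: they use information at the start and end of a long context more reliably than information in the middle. If a RAG system retrieves 10 chunks and the answer is in chunk 5, the model is more likely to hallucinate than if the answer is in chunk 1 or 10. Models are also biased toward answering from parametric memory rather than from the supplied context. Reordering so that the highest-ranked chunks sit at the edges of the context window mitigates the positional bias; the strict 'answer only from context' constraint addresses the parametric-memory bias.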
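The reordering can be sketched as follows, assuming the retriever returns chunks sorted most-relevant first. The function name reorder_to_edges is hypothetical, and alternating between front and back is one simple placement strategy, not necessarily the one used by any particular RAG library.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def reorder_to_edges(ranked: Sequence[T]) -> List[T]:
    """Place the best-ranked items at the edges, the weakest in the middle.

    `ranked` is assumed to be sorted most-relevant first.
    """
    front: List[T] = []
    back: List[T] = []
    for i, chunk in enumerate(ranked):
        # Even ranks fill the front half, odd ranks fill the back half.
        (front if i % 2 == 0 else back).append(chunk)
    # Reverse the back half so the second-best chunk ends up last.
    return front + back[::-1]


# Ranks 1 (best) to 5 (worst): rank 1 goes first, rank 2 last, rank 5 in the middle.
print(reorder_to_edges([1, 2, 3, 4, 5]))  # [1, 3, 5, 4, 2]
```

With 10 retrieved chunks this puts ranks 1 and 2 at the two ends and ranks 9 and 10 in the middle, which is where a lost answer costs the least.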
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:04:43.511347+00:00— report_created — created