Report #58104
[research] LLM hallucinates answers instead of using facts located in the middle of a long retrieved context
Reposition the most relevant retrieved chunks to the very beginning and very end of the prompt context, or force the model to output a verbatim quote from the context before synthesizing the answer.
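A minimal sketch of the reordering step, assuming the retriever already returns chunks sorted from most to least relevant. The function name `reorder_for_edges` is hypothetical and not part of any library; it simply alternates chunks between the front and the back so that the weakest chunks land in the middle.

```python
from typing import List, Sequence


def reorder_for_edges(ranked_chunks: Sequence[str]) -> List[str]:
    """Place the most relevant chunks at the start and end of the context.

    Assumes `ranked_chunks` is sorted most relevant first (an assumption
    about the retriever, not something the report specifies).
    """
    front: List[str] = []
    back: List[str] = []
    for rank, chunk in enumerate(ranked_chunks):
        # Even ranks (0, 2, 4, ...) fill the front; odd ranks fill the back.
        if rank % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    # Reverse the back half so the second-best chunk ends up last.
    return front + back[::-1]


if __name__ == "__main__":
    ranked = ["best", "second", "third", "fourth", "fifth"]
    context = "\n\n".join(reorder_for_edges(ranked))
    print(context)
```

With five ranked chunks, the best chunk comes first, the second-best comes last, and the least relevant chunk sits in the middle, where it matters least if it is overlooked.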
Journey Context:
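Research on long-context behavior (the "lost in the middle" effect) shows that LLM accuracy follows a U-shaped curve over the position of the relevant information: models use facts near the beginning and end of the context most reliably, and often miss facts placed in the middle. If a RAG system naively concatenates chunks, facts that land in the middle can be overlooked, and the model falls back on its parametric memory and hallucinates an answer. Reordering chunks moves the evidence to positions the model uses well, and requiring a verbatim quote before the answer makes the model locate the evidence before it synthesizes.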
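The quote-first variant can be sketched as a prompt template plus a check that the returned quote really appears in the context. Everything here is illustrative: the `QUOTE:` / `ANSWER:` line format, the template wording, and the helper names `build_prompt` and `extract_supported_answer` are assumptions, and the call to the model itself is left out.

```python
import re
from typing import Optional

# Hypothetical prompt format; adjust the wording to the model in use.
QUOTE_FIRST_TEMPLATE = (
    "Context:\n{context}\n\n"
    "Question: {question}\n\n"
    "First, copy the single most relevant passage from the context, word for "
    "word, on a line starting with 'QUOTE:'. Then give the answer on a line "
    "starting with 'ANSWER:'. If no passage answers the question, write "
    "'QUOTE: NONE'."
)


def build_prompt(context: str, question: str) -> str:
    """Fill the quote-first template with the retrieved context and question."""
    return QUOTE_FIRST_TEMPLATE.format(context=context, question=question)


def _normalize(text: str) -> str:
    # Collapse whitespace and lowercase so line wrapping does not break matching.
    return " ".join(text.split()).lower()


def extract_supported_answer(response: str, context: str) -> Optional[str]:
    """Return the answer only if the model's quote occurs in the context."""
    quote_match = re.search(r"^QUOTE:[ \t]*(.+)$", response, flags=re.MULTILINE)
    answer_match = re.search(
        r"^ANSWER:[ \t]*(.+)$", response, flags=re.MULTILINE | re.DOTALL
    )
    if quote_match is None or answer_match is None:
        return None
    quote = quote_match.group(1).strip().strip('"')
    if quote.upper() == "NONE":
        return None
    # Reject the answer when the quote is not actually present in the context.
    if _normalize(quote) not in _normalize(context):
        return None
    return answer_match.group(1).strip()
```

Returning `None` when the quote is missing or not found in the context gives the caller a signal to retry, reorder the chunks, or report that the answer is unsupported, rather than passing along an ungrounded answer.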
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:01:04.236664+00:00— report_created — created