Report #12394
[research] LLM incorporates irrelevant or misleading information from retrieved documents (distractors) into its final answer, reducing factuality
Implement a relevance filtering step (e.g., cross-encoder reranking or an LLM-based 'is this relevant?' classifier) between retrieval and generation. Inject only the top-k most relevant chunks into the context window rather than stuffing the prompt with everything retrieved.
Journey Context:
Naive RAG pipelines retrieve the top-k documents and stuff all of them into the prompt. LLMs are susceptible to 'lost in the middle' effects, where information placed mid-context is used less reliably than information near the start or end, and to distractor contamination: they may build an answer from retrieved text that is irrelevant simply because it is present in the prompt. Reranking followed by strict truncation (e.g., keeping the top 3 chunks instead of the top 10) reduces this noise. The tradeoff is that if the reranker is wrong, the generator never sees the correct document, but this is generally safer than context stuffing.
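A minimal sketch of the filtering step described above, assuming a pluggable scoring function. The names `filter_chunks` and `overlap_score`, the `min_score` threshold, and the sample chunks are all hypothetical and not part of the original report. The lexical-overlap scorer is only a runnable stand-in; in a real pipeline it would be replaced by a cross-encoder or an LLM-based relevance classifier.

```python
import re
from typing import Callable, List, Tuple

# A scorer maps (query, chunk) to a relevance score; higher means more relevant.
Scorer = Callable[[str, str], float]


def overlap_score(query: str, chunk: str) -> float:
    """Toy stand-in scorer (assumption): fraction of query terms found in the chunk."""
    query_terms = set(re.findall(r"\w+", query.lower()))
    chunk_terms = set(re.findall(r"\w+", chunk.lower()))
    if not query_terms:
        return 0.0
    return len(query_terms & chunk_terms) / len(query_terms)


def filter_chunks(
    query: str,
    chunks: List[str],
    scorer: Scorer = overlap_score,
    top_k: int = 3,
    min_score: float = 0.6,
) -> List[Tuple[str, float]]:
    """Score every retrieved chunk, drop those below min_score, keep the best top_k."""
    scored = [(chunk, scorer(query, chunk)) for chunk in chunks]
    kept = [pair for pair in scored if pair[1] >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Hypothetical retrieval output containing distractors.
retrieved = [
    "The Eiffel Tower is 330 metres tall.",
    "Paris has many popular cafes and museums.",
    "How tall is a giraffe? Up to 5.5 metres.",
    "The Eiffel Tower was completed in 1889.",
]
context = filter_chunks("how tall is the eiffel tower", retrieved, top_k=2)
for chunk, score in context:
    print(f"{score:.2f}  {chunk}")
```

Only the chunks returned by `filter_chunks` would be passed to the generator. Combining a score threshold with a top-k cap means that fewer than k chunks may be injected when little of the retrieved text is relevant, which is the intended behavior here; the threshold value itself is an assumption and would need tuning for any real scorer.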
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T15:50:57.297390+00:00— report_created — created