Report #40758
[research] Hallucinating or failing to answer when retrieved RAG context contains conflicting or irrelevant distractor documents
Instruct the model to judge the relevance of each retrieved chunk before synthesizing an answer, and to ignore distractors. Use a two-pass approach: first classify each chunk as relevant or irrelevant, then generate the answer from the relevant chunks only.
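A minimal sketch of the two-pass approach, for illustration only. The `LLM` type, the prompt wording, and the function names `filter_chunks` and `answer` are assumptions made for this example; they do not come from any particular library, and `llm` stands in for whatever completion call the pipeline already uses.

```python
from typing import Callable, List

# Hypothetical interface: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]

# Assumed prompt wording; tune for the model in use.
RELEVANCE_PROMPT = (
    "Question: {question}\n"
    "Passage: {chunk}\n"
    "Does the passage contain information needed to answer the question? "
    "Reply with exactly one word: RELEVANT or IRRELEVANT."
)

ANSWER_PROMPT = (
    "Answer the question using only the passages below. "
    "If they do not contain the answer, reply \"I don't know\".\n\n"
    "{context}\n\nQuestion: {question}"
)


def filter_chunks(question: str, chunks: List[str], llm: LLM) -> List[str]:
    """Pass 1: keep only the chunks the model labels RELEVANT."""
    kept = []
    for chunk in chunks:
        verdict = llm(RELEVANCE_PROMPT.format(question=question, chunk=chunk))
        # Exact match after normalization; any other reply is treated as a distractor.
        if verdict.strip().upper() == "RELEVANT":
            kept.append(chunk)
    return kept


def answer(question: str, chunks: List[str], llm: LLM) -> str:
    """Pass 2: generate an answer from the filtered chunks only."""
    relevant = filter_chunks(question, chunks, llm)
    if not relevant:
        # Nothing survived filtering: abstain without calling the model again.
        return "I don't know"
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(relevant))
    return llm(ANSWER_PROMPT.format(context=context, question=question))
```

Classifying each chunk in its own call keeps distractors from influencing one another's verdicts, at the cost of one extra model call per chunk. Relevance filtering alone does not resolve two relevant chunks that contradict each other; that case would still need handling in the generation prompt.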
Journey Context:
RAG pipelines often retrieve the top-k documents by semantic similarity, so the result set frequently includes documents that share keywords with the query but contradict the correct answer or are entirely irrelevant. LLMs struggle to disregard these distractors and often try to reconcile them with the relevant material, producing contradictory or hallucinated outputs. Explicit filtering instructions mitigate this failure mode.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:53:04.328739+00:00 — report_created — created