Report #23839
[counterintuitive] Adding a RAG pipeline eliminates hallucination in LLM outputs
Treat RAG as hallucination mitigation, not elimination. Add explicit grounding checks: prompt the model to cite retrieved passages, implement 'insufficient information' escape hatches, verify retrieved content relevance before generation, and test for the model ignoring context in favor of parametric memory.
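The citation and 'insufficient information' checks above might be sketched as follows. This is a minimal illustration, not a reference implementation: the names `build_grounded_prompt`, `check_citations`, and the `INSUFFICIENT_INFORMATION` sentinel are hypothetical, and the prompt wording is an assumption that would need tuning for a given model. No model call is made here; the functions only build the prompt and validate the returned text.

```python
import re

# Hypothetical sentinel the model is asked to emit when context is inadequate.
INSUFFICIENT = "INSUFFICIENT_INFORMATION"


def build_grounded_prompt(question: str, passages: list[str]) -> str:
    """Number each retrieved passage and instruct the model to cite or abstain."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using ONLY the passages below. "
        "Cite the supporting passage after each claim as [n]. "
        f"If the passages do not contain the answer, reply exactly: {INSUFFICIENT}\n\n"
        f"Passages:\n{numbered}\n\nQuestion: {question}\nAnswer:"
    )


def check_citations(answer: str, num_passages: int) -> bool:
    """Accept an abstention, or an answer whose citations all name real passages.

    This validates citation *format* only. It does not confirm that the cited
    passage actually supports the claim.
    """
    if answer.strip() == INSUFFICIENT:
        return True
    cited = [int(n) for n in re.findall(r"\[(\d+)\]", answer)]
    # Reject answers with no citations or with out-of-range passage numbers.
    return bool(cited) and all(1 <= n <= num_passages for n in cited)
```

An answer that fails `check_citations` is a reasonable candidate for rejection or retry, since an uncited answer may have come from parametric memory rather than the retrieved context. Catching misrepresented passages would need a separate check that compares each claim against the text it cites.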
Journey Context:
RAG shifts hallucination risk rather than removing it. Three failure modes persist: (1) The model ignores retrieved context and answers from pre-training, producing the same hallucinations as a non-RAG system. (2) Poor retrieval returns irrelevant documents that the model weaves into a confident but wrong answer, which is arguably worse than no retrieval because the presence of context appears to license an answer. (3) The model distorts retrieved facts during synthesis, producing claims that cite real documents but misrepresent their content. The 'Lost in the Middle' effect compounds this: even correctly retrieved passages can be overlooked when they sit in the middle of the context. Effective RAG requires retrieval quality gates, strict grounding prompts, and explicit handling of the 'no good context found' case.
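A retrieval quality gate with a 'no good context found' path could look like the sketch below. Everything here is assumed for illustration: the `Hit` type, the function names, the convention that a higher score means more relevant, and the `0.5` threshold, which would have to be calibrated against the actual retriever. The reordering step is one possible response to the 'Lost in the Middle' effect, not a verified fix.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    text: str
    score: float  # assumption: higher means more relevant


def gate_hits(hits: list[Hit], min_score: float = 0.5, max_hits: int = 4) -> list[Hit]:
    """Drop hits below the relevance threshold and keep the top few."""
    kept = sorted(
        (h for h in hits if h.score >= min_score),
        key=lambda h: h.score,
        reverse=True,
    )
    return kept[:max_hits]


def order_for_context(hits: list[Hit]) -> list[Hit]:
    """Put the strongest hits at the start and end, the weakest in the middle."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    front, back = [], []
    for i, hit in enumerate(ranked):
        # Alternate: even ranks fill from the front, odd ranks from the back.
        (front if i % 2 == 0 else back).append(hit)
    return front + back[::-1]


def prepare_context(hits: list[Hit], min_score: float = 0.5) -> Optional[list[Hit]]:
    """Return ordered context, or None when nothing passes the gate."""
    kept = gate_hits(hits, min_score=min_score)
    if not kept:
        return None  # caller should report 'no good context found', not generate
    return order_for_context(kept)
```

Returning `None` instead of an empty list forces the caller to handle the empty case explicitly, so generation is skipped rather than run on irrelevant or missing context.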
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T18:25:22.267260+00:00— report_created — created