Report #24952
[counterintuitive] Adding RAG eliminates hallucination
Treat RAG as shifting failure modes, not eliminating them. Implement retrieval quality scoring, source attribution verification, and guardrails that detect when the model contradicts or goes beyond the retrieved context. Monitor for retrieval-augmented hallucination, in which the model misattributes or conflates retrieved passages.
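A guardrail that flags output going beyond the retrieved context could be sketched roughly as follows. This is a minimal illustration, not a production check: the function names, the stop-word list, and the 0.6 threshold are all hypothetical, and plain token overlap stands in for what would more realistically be an entailment or NLI model.

```python
import re

# Hypothetical, deliberately tiny stop-word list for the sketch.
STOP_WORDS = {"the", "a", "an", "of", "in", "is", "are", "to", "and", "was", "it"}

def tokens(text):
    """Lowercased alphanumeric tokens with stop words removed."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS}

def support_score(sentence, passages):
    """Best fraction of the sentence's tokens found in any single passage."""
    sent = tokens(sentence)
    if not sent:
        return 1.0  # nothing checkable in this sentence
    return max((len(sent & tokens(p)) / len(sent) for p in passages), default=0.0)

def unsupported_sentences(answer, passages, threshold=0.6):
    """Return answer sentences whose lexical support falls below the threshold."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [s for s in sentences if support_score(s, passages) < threshold]

passages = ["The Eiffel Tower was completed in 1889 in Paris."]
answer = "The Eiffel Tower was completed in 1889. It was designed by aliens from Mars."
flagged = unsupported_sentences(answer, passages)
```

Scoring against the best single passage, rather than the union of all passages, is a deliberate choice here: it makes a sentence stitched together from several passages more likely to be flagged for review.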
Journey Context:
RAG is widely believed to solve hallucination by grounding the model in external data. In practice, RAG introduces new failure modes: (1) retrieval returns wrong documents, which the model then confidently uses as evidence; (2) the model conflates multiple retrieved passages; (3) the model generates content contradicting the retrieved context; (4) the model uses retrieved facts as a springboard for ungrounded elaboration. Research shows RAG can increase certain error types when retrieval is noisy: the model feels 'licensed' to be confident because it has sources, even if those sources are wrong or misapplied. The fix isn't more retrieval; it's better retrieval quality control, explicit source attribution with citation verification, and detection of model-retrieval disagreement. A model that says 'based on the retrieved document' when it's actually hallucinating is more dangerous than one that hallucinates without claiming sources.
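Citation verification might look something like the sketch below. It assumes, purely for illustration, that the model marks sources inline as `[id]` and that passages are kept in a dict keyed by that id; `check_citations`, the marker format, and the 0.5 overlap threshold are all hypothetical. Token overlap is again only a crude proxy for real claim-to-source verification.

```python
import re

def _tokens(text):
    """Lowercased alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def _overlap(claim, passage):
    """Fraction of the claim's tokens that also appear in the passage."""
    claim_tokens = _tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & _tokens(passage)) / len(claim_tokens)

def check_citations(answer, passages, min_overlap=0.5):
    """Return (cited_id, problem) pairs for citations that look suspicious."""
    findings = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        cited = re.findall(r"\[(\w+)\]", sentence)
        claim = re.sub(r"\[\w+\]", "", sentence)  # strip the markers themselves
        for cid in cited:
            if cid not in passages:
                findings.append((cid, "unknown_source"))
                continue
            if _overlap(claim, passages[cid]) >= min_overlap:
                continue  # cited passage plausibly supports the claim
            # Does a different retrieved passage support the claim better?
            better = [pid for pid, text in passages.items()
                      if pid != cid and _overlap(claim, text) >= min_overlap]
            findings.append((cid, "misattributed" if better else "unsupported"))
    return findings

passages = {
    "a": "The Eiffel Tower opened in 1889.",
    "b": "The Eiffel Tower is 330 metres tall.",
}
answer = "The tower opened in 1889 [b]. It is 330 metres tall [b]. It has a pool [c]."
findings = check_citations(answer, passages)
```

Separating "misattributed" from "unsupported" and "unknown_source" keeps failure modes (2) and (4) distinguishable in monitoring, instead of collapsing them into a single hallucination counter.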
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:17:32.109648+00:00— report_created — created