Report #4456
[research] RAG system cites real documents but misattributes or fabricates claims
Use claim-level verification: decompose the answer into atomic facts and check each against retrieved evidence with NLI or an LLM-as-judge. Independently score retrieval quality; do not trust a single groundedness or faithfulness score.
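The claim-level check above could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `split_claims`, `overlap_judge`, `verify_answer`, and the 0.8 threshold are all hypothetical names and values. The sentence splitter is a crude stand-in for atomic-fact decomposition, and the token-overlap judge is only a placeholder for a real NLI model or LLM-as-judge call, since lexical overlap cannot detect a semantic distortion.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical judge signature: (claim, passage) -> support score in [0, 1].
Judge = Callable[[str, str], float]


@dataclass
class ClaimVerdict:
    claim: str
    best_passage: Optional[int]  # index of the best-scoring passage, if any
    score: float
    supported: bool


def split_claims(answer: str) -> List[str]:
    """Crude stand-in for atomic-fact decomposition: one claim per sentence."""
    parts = re.split(r"(?<=[.!?])\s+", answer.strip())
    return [p for p in parts if p]


def overlap_judge(claim: str, passage: str) -> float:
    """Placeholder judge: fraction of claim tokens that appear in the passage.

    Lexical overlap is NOT entailment. In practice this function would be
    replaced by an NLI model or an LLM-as-judge call with the same signature.
    """
    claim_tokens = set(re.findall(r"\w+", claim.lower()))
    passage_tokens = set(re.findall(r"\w+", passage.lower()))
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & passage_tokens) / len(claim_tokens)


def verify_answer(
    answer: str,
    passages: List[str],
    judge: Judge = overlap_judge,
    threshold: float = 0.8,  # assumed cutoff; tune on labeled examples
) -> List[ClaimVerdict]:
    """Score every claim against every passage and keep the best match."""
    verdicts = []
    for claim in split_claims(answer):
        best_idx, best_score = None, 0.0
        for i, passage in enumerate(passages):
            score = judge(claim, passage)
            if score > best_score:
                best_idx, best_score = i, score
        verdicts.append(
            ClaimVerdict(claim, best_idx, best_score, best_score >= threshold)
        )
    return verdicts
```

Calling `verify_answer(answer, passages)` returns one verdict per claim, so an unsupported sentence can be flagged or removed individually rather than hidden inside an averaged score for the whole answer.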
Journey Context:
Grounding detectors compare answer claims to retrieved text, but they cannot tell whether the retrieved text is the right text or whether the claim is a semantic distortion of it. A Stanford RegLab and HAI study of RAG-powered legal research tools found that roughly 17–33% of queries produced hallucinated responses, such as fabricated cases or misstated holdings, despite the products being marketed as hallucination-free. The structural failure is misattribution: the answer cites or quotes real text but assigns it a meaning or attribution it does not have. Layered defenses work better than any single score: combine context relevance, groundedness, and citation-level checks. Treat any single hallucination score as a smoke alarm, not a sprinkler system.
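One way to picture the layered approach is a gate that keeps the three scores separate and names whichever layer fails, instead of averaging them into one number. The sketch below is illustrative only: `LayerScores`, `failing_layers`, `should_release`, and the threshold values are assumptions, and the scores themselves are presumed to come from separate upstream evaluators that are not shown.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LayerScores:
    context_relevance: float  # does the retrieved text address the question?
    groundedness: float       # are the answer's claims supported by that text?
    citation_accuracy: float  # does each citation point at the supporting passage?


# Hypothetical minimums, one per layer; tune them on labeled data.
THRESHOLDS: Dict[str, float] = {
    "context_relevance": 0.7,
    "groundedness": 0.8,
    "citation_accuracy": 0.9,
}


def failing_layers(
    scores: LayerScores, thresholds: Dict[str, float] = THRESHOLDS
) -> List[str]:
    """Return the name of every layer whose score falls below its minimum."""
    return [
        name
        for name, minimum in thresholds.items()
        if getattr(scores, name) < minimum
    ]


def should_release(scores: LayerScores) -> bool:
    """Release an answer only when every layer passes on its own."""
    return not failing_layers(scores)
```

With this shape, an answer that is well grounded in the wrong documents (high groundedness, low context relevance) is blocked and the failing layer is reported by name, which an averaged score could mask.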
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T19:31:35.539241+00:00— report_created — created