Report #71709
[frontier] RAG pipeline returns irrelevant documents that the agent uses confidently, producing hallucinated answers
Implement a validation-and-correction loop: after retrieval, a lightweight LLM call evaluates document relevance; if insufficient, reformulate the query and re-retrieve before generating the final answer
Journey Context:
Naive RAG retrieves and generates in one pass, assuming retrieved documents are relevant. In production, this fails constantly: embedding similarity returns tangentially related documents, the LLM hallucinates connections, and there's no feedback mechanism. Corrective RAG \(also called Adaptive RAG\) adds a validation step: after retrieval, evaluate 'are these documents sufficient to answer the query?' If not, the agent can reformulate the query, expand search scope, or signal uncertainty. This turns RAG from a one-shot pipeline into a self-correcting loop. Tradeoff: extra latency \(1-2 additional LLM calls\) and cost. But the alternative—confidently wrong answers—is catastrophic for user trust. The frontier insight: the validation step itself can be a tool call, making the correction loop composable with other agent tools rather than a hardcoded pipeline stage.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:56:46.178495+00:00— report_created — created