Report #78907
[frontier] RAG retrieves irrelevant documents and the generation pipeline fails to recognize the poor quality of retrieval
Implement Corrective RAG (CRAG) using LangGraph to add a document grader node that evaluates retrieval quality and triggers re-retrieval with rewritten queries if relevance is below threshold
Journey Context:
Standard RAG pipelines assume the top-k chunks are relevant, which leads to hallucinations when the retriever selects off-topic documents. CRAG introduces an explicit 'retrieval evaluator' step that uses an LLM to grade each retrieved document (yes/no relevance). If all documents are irrelevant (or below a confidence threshold), the system triggers a query transformation node that rewrites the query for better retrieval, then loops back to the retriever. The result is a self-correcting RAG system that validates retrieval before generation; this report states that it reduces hallucination by 40% in production.
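The retrieve, grade, rewrite loop described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the reporter's implementation: `run_crag`, `CragState`, `max_retries`, and all `toy_*` helpers are hypothetical names, and the keyword-based grader and rewriter only stand in for the LLM calls. In LangGraph, each callable would typically become a node and the `if` check a conditional edge; that wiring is left out so the sketch runs without dependencies.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class CragState:
    """State carried through the loop (hypothetical shape)."""
    question: str                      # original user question, never rewritten
    query: str                         # current retrieval query, may be rewritten
    documents: List[str] = field(default_factory=list)
    attempts: int = 0                  # number of query rewrites so far


def run_crag(
    question: str,
    retrieve: Callable[[str], List[str]],
    grade: Callable[[str, str], bool],
    rewrite: Callable[[str], str],
    generate: Callable[[str, List[str]], str],
    max_retries: int = 2,              # assumed cap to avoid an endless loop
) -> Tuple[str, CragState]:
    state = CragState(question=question, query=question)
    while True:
        candidates = retrieve(state.query)
        # Grader step: keep only documents judged relevant to the question.
        state.documents = [d for d in candidates if grade(state.question, d)]
        if state.documents or state.attempts >= max_retries:
            break
        # All documents were rejected: rewrite the query and retrieve again.
        state.query = rewrite(state.query)
        state.attempts += 1
    return generate(state.question, state.documents), state


# --- Toy stand-ins for the retriever and the LLM calls (assumptions) ---
CORPUS = [
    "Corrective RAG grades retrieved documents before generation.",
    "Boil pasta in salted water for ten minutes.",
]


def toy_retrieve(query: str) -> List[str]:
    words = set(query.lower().split())
    hits = [d for d in CORPUS if words & set(d.lower().rstrip(".").split())]
    return hits or CORPUS[-1:]         # simulate an off-topic top-k result


def toy_grade(question: str, doc: str) -> bool:
    return "rag" in doc.lower().split()    # an LLM yes/no grader in practice


def toy_rewrite(query: str) -> str:
    return query.replace("crag", "corrective rag")


def toy_generate(question: str, docs: List[str]) -> str:
    if not docs:
        return "No relevant context found."
    return f"Answer grounded in {len(docs)} document(s)."


answer, final_state = run_crag(
    "how does crag work", toy_retrieve, toy_grade, toy_rewrite, toy_generate
)
```

The retry cap is a design choice the report does not mention; without some bound, a query that never retrieves relevant documents would loop indefinitely. When the cap is reached, this sketch passes an empty document list to the generator so it can decline to answer rather than generate from rejected context.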
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:02:11.656948+00:00— report_created — created