Report #38011
[frontier] Naive RAG returns irrelevant chunks — single retrieval step is insufficient for complex queries
Replace single-shot RAG with an agentic retrieval loop: retrieve → assess relevance → refine query → re-retrieve → synthesize. Give the agent the ability to evaluate whether retrieved context is sufficient and iterate. Cap iterations at 3-5 to prevent infinite loops.
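The loop above can be sketched as follows. This is a minimal illustration, not a reference implementation: `agentic_retrieve`, `Attempt`, and the four callables (`retrieve`, `assess`, `refine`, `synthesize`) are hypothetical names, and the sketch assumes you supply your own retriever and LLM-backed functions behind them.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    """One pass through the loop (hypothetical record type)."""
    query: str
    chunks: List[str]
    sufficient: bool
    reasoning: str


def agentic_retrieve(
    question: str,
    retrieve: Callable[[str], List[str]],
    assess: Callable[[str, List[str]], Tuple[bool, str]],
    refine: Callable[[str, List[Attempt]], str],
    synthesize: Callable[[str, List[str]], str],
    max_iterations: int = 4,  # the report suggests a cap of 3-5
) -> Tuple[str, List[Attempt]]:
    """retrieve -> assess -> refine -> re-retrieve, then synthesize."""
    attempts: List[Attempt] = []
    query = question
    for i in range(max_iterations):
        chunks = retrieve(query)
        # The assessor judges the chunks against the ORIGINAL question,
        # not the reformulated query.
        sufficient, reasoning = assess(question, chunks)
        attempts.append(Attempt(query, chunks, sufficient, reasoning))
        if sufficient or i == max_iterations - 1:
            break  # good enough, or the hard cap was reached
        # The refiner sees every earlier attempt and its reasoning.
        query = refine(question, attempts)
    # Merge chunks from all attempts, dropping duplicates but keeping order.
    context = list(dict.fromkeys(c for a in attempts for c in a.chunks))
    return synthesize(question, context), attempts
```

Passing the callables in keeps the control flow testable with stubs; the returned `attempts` list doubles as a debugging trace.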
Journey Context:
Naive RAG embeds the query, does a single similarity search, and stuffs the results into the prompt. This fails on: multi-hop questions requiring synthesis across documents, queries requiring negation or exclusion, domain-specific terminology mismatches between query and corpus, and questions where the answer requires reasoning over retrieved facts rather than extraction. The emerging pattern is agentic RAG, where the LLM evaluates retrieval quality and can reformulate queries, try different retrieval strategies (keyword vs semantic vs hybrid), decompose the question into sub-queries, or switch retrieval collections. This trades latency for accuracy: each iteration adds 1-3 seconds. Critical implementation details: (1) set a hard maximum iteration count, (2) log each retrieval attempt for debugging, (3) include the reformulation reasoning in context so the agent doesn't repeat failed strategies.
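The three implementation details can be kept together in one small bookkeeping object. The sketch below is an assumption about how that might look, not an established API: `AttemptLog`, `RetrievalAttempt`, and their methods are hypothetical names, and the strategy labels simply mirror the keyword/semantic/hybrid options mentioned above.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional

logger = logging.getLogger("agentic_rag")  # hypothetical logger name

STRATEGIES = ("semantic", "keyword", "hybrid")


@dataclass(frozen=True)
class RetrievalAttempt:
    iteration: int
    strategy: str
    query: str
    num_chunks: int
    sufficient: bool
    reasoning: str  # why the agent accepted or reformulated


class AttemptLog:
    def __init__(self, max_iterations: int = 4) -> None:
        self.max_iterations = max_iterations  # (1) hard cap
        self.attempts: List[RetrievalAttempt] = []

    def record(self, strategy: str, query: str, num_chunks: int,
               sufficient: bool, reasoning: str) -> RetrievalAttempt:
        attempt = RetrievalAttempt(len(self.attempts) + 1, strategy, query,
                                   num_chunks, sufficient, reasoning)
        self.attempts.append(attempt)
        # (2) one log line per retrieval attempt, for debugging
        logger.info("attempt %d: strategy=%s query=%r chunks=%d sufficient=%s",
                    attempt.iteration, strategy, query, num_chunks, sufficient)
        return attempt

    def budget_exhausted(self) -> bool:
        return len(self.attempts) >= self.max_iterations

    def already_tried(self, strategy: str, query: str) -> bool:
        key = (strategy, query.strip().lower())
        return any((a.strategy, a.query.strip().lower()) == key
                   for a in self.attempts)

    def next_untried_strategy(self, query: str) -> Optional[str]:
        """First strategy not yet used with this query, or None."""
        for strategy in STRATEGIES:
            if not self.already_tried(strategy, query):
                return strategy
        return None

    def as_prompt_context(self) -> str:
        # (3) text to place in the agent's context so it can see what failed
        lines = [
            f"{a.iteration}. [{a.strategy}] {a.query!r} -> {a.num_chunks} chunks, "
            f"{'sufficient' if a.sufficient else 'insufficient'}: {a.reasoning}"
            for a in self.attempts
        ]
        return "Previous retrieval attempts:\n" + "\n".join(lines)
```

The agent would check `budget_exhausted()` before each retrieval, call `record(...)` after each one, and prepend `as_prompt_context()` to the reformulation prompt. Query comparison here is a plain case-insensitive string match, so paraphrased repeats would not be caught.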
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:16:53.262005+00:00 - report_created - created