Report #44003
[frontier] Naive RAG returning irrelevant chunks for complex queries: how to improve retrieval accuracy in agent systems
Replace single-shot retrieve-then-generate RAG with agentic retrieval: give the agent search-as-a-tool, let it reformulate queries based on initial results, decompose complex questions into sub-queries, and iterate until it has sufficient context to answer.
Journey Context:
Naive RAG (embed the query, rank chunks by cosine similarity, take the top k, stuff them into the prompt) works for simple factual lookups but fails on complex, multi-faceted, or ambiguous queries, because a single query embedding cannot represent several distinct information needs at once. The emerging pattern makes retrieval itself agentic: the agent is given search tools and decides how to use them across multiple rounds. It can decompose 'compare our Q3 revenue to competitor X's Q3 revenue' into two sub-queries, merge the results, identify what is still missing, and search again. Microsoft's GraphRAG takes a complementary approach by pre-building knowledge graphs from documents to support multi-hop reasoning. The tradeoff is that agentic retrieval is slower and more expensive, since each question costs multiple LLM calls and retrieval rounds. For high-stakes domains (legal, medical, financial, enterprise knowledge), the accuracy gain usually justifies the cost. For simple FAQ-style Q&A, naive RAG remains appropriate. The mistake is applying one pattern universally: use naive RAG for simple lookups, agentic RAG for complex reasoning, and GraphRAG for multi-hop relationship queries.
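The loop described above (decompose, search, check for gaps, reformulate, repeat) can be sketched as follows. This is a minimal illustration, not a reference implementation: agentic_retrieve, RetrievalResult, and the three callables are hypothetical names, and the toy_* functions are hard-coded stand-ins for what would be a vector-store query and two LLM calls in a real system. The round cap is an assumption added here to bound cost.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interfaces: in a real system `search` would query a vector
# store, and `decompose` / `find_gaps` would each be an LLM call.
SearchFn = Callable[[str], List[str]]
DecomposeFn = Callable[[str], List[str]]
GapFn = Callable[[str, List[str]], List[str]]


@dataclass
class RetrievalResult:
    chunks: List[str] = field(default_factory=list)
    queries_run: List[str] = field(default_factory=list)
    rounds: int = 0


def agentic_retrieve(question: str, search: SearchFn, decompose: DecomposeFn,
                     find_gaps: GapFn, max_rounds: int = 3) -> RetrievalResult:
    """Run sub-queries, then let the agent request follow-ups until no gaps
    remain or the round budget is spent."""
    result = RetrievalResult()
    seen = set()
    pending = decompose(question)  # initial sub-queries
    while pending and result.rounds < max_rounds:
        result.rounds += 1
        for query in pending:
            if query in result.queries_run:
                continue  # never repeat an identical query
            result.queries_run.append(query)
            for chunk in search(query):
                if chunk not in seen:  # merge results without duplicates
                    seen.add(chunk)
                    result.chunks.append(chunk)
        # The agent inspects what it has and proposes reformulated queries.
        pending = find_gaps(question, result.chunks)
    return result


# --- Toy stand-ins (assumptions for illustration only) ---
CORPUS = {
    "our q3 revenue": ["doc-A: our Q3 revenue figure"],
    "competitor x q3 revenue": ["doc-B: competitor X Q3 revenue figure"],
}


def toy_search(query: str) -> List[str]:
    # Exact-match lookup standing in for embedding search.
    return CORPUS.get(query.lower(), [])


def toy_decompose(question: str) -> List[str]:
    # A real agent would derive these from the question; the second
    # sub-query is deliberately too vague and will miss.
    return ["our Q3 revenue", "competitor X revenue"]


def toy_find_gaps(question: str, chunks: List[str]) -> List[str]:
    # Reformulate if nothing about the competitor has been retrieved yet.
    if not any("competitor" in c.lower() for c in chunks):
        return ["competitor X Q3 revenue"]
    return []
```

Calling agentic_retrieve("compare our Q3 revenue to competitor X's Q3 revenue", toy_search, toy_decompose, toy_find_gaps) runs two rounds: the first retrieves only the "our revenue" chunk, the gap check notices the competitor side is missing, and the reformulated query in the second round fills it. The max_rounds cap is where the cost tradeoff shows up: each extra round is another LLM call plus another retrieval pass.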
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:19:57.581028+00:00— report_created — created