Report #96990
[frontier] Naive RAG pipeline returns irrelevant chunks and agent still hallucinates on complex queries
Replace the retrieve-then-generate pipeline with agentic retrieval: give the agent search tools (vector search, keyword search, SQL) and let it iteratively decide what to retrieve, reformulate queries, and cross-reference sources before answering.
Journey Context:
Naive RAG (embed the query, run a vector search, stuff the chunks into the prompt) fails on complex questions for three reasons: a single query cannot capture the full information need, top-k retrieval misses relevant but non-obvious chunks, and the agent cannot follow up on incomplete results. Agentic RAG inverts control: the agent decides what it needs. It can reformulate queries, try different search strategies, and stop once it is confident in the evidence. The tradeoff is higher latency from multiple retrieval steps and higher cost from multiple LLM calls. In return, accuracy can improve substantially, with production systems reporting 2-3x improvement on complex queries. The key implementation detail: expose retrieval as tools, not pipeline steps, so the agent calls search_vector, search_keyword, and query_sql as needed. Start with agentic RAG for any query requiring synthesis from multiple sources; reserve pipeline RAG for simple factual lookups.
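A minimal sketch of the "retrieval as tools" idea, under several assumptions. Only the tool names search_vector, search_keyword, and query_sql come from this report. The toy corpus, the in-memory revenue table, the Step record, run_agent, and scripted_policy are all hypothetical. Word overlap stands in for embedding similarity, and a hard-coded policy stands in for the LLM that would normally choose the next tool call.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy corpus standing in for a real document index.
DOCS: Dict[str, str] = {
    "d1": "acme acquired bolt labs in 2023",
    "d2": "bolt labs builds battery sensors",
    "d3": "acme revenue grew in 2023",
}

# Hypothetical structured data, held in an in-memory SQLite database.
DB = sqlite3.connect(":memory:")
DB.execute("CREATE TABLE revenue (company TEXT, year INTEGER, amount INTEGER)")
DB.executemany("INSERT INTO revenue VALUES (?, ?, ?)",
               [("acme", 2022, 100), ("acme", 2023, 120)])

def _words(text: str) -> set:
    return set(text.lower().split())

def search_vector(query: str) -> List[str]:
    """Assumption: word overlap as a crude stand-in for embedding similarity (top 2)."""
    q = _words(query)
    scored = [(len(q & _words(text)), doc_id) for doc_id, text in DOCS.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0][:2]

def search_keyword(query: str) -> List[str]:
    """Exact match: every query term must appear in the document."""
    q = _words(query)
    return sorted(doc_id for doc_id, text in DOCS.items() if q <= _words(text))

def query_sql(sql: str) -> List[str]:
    """Run a query against the toy table; a real system would restrict this to read-only."""
    return [str(row) for row in DB.execute(sql).fetchall()]

TOOLS: Dict[str, Callable[[str], List[str]]] = {
    "search_vector": search_vector,
    "search_keyword": search_keyword,
    "query_sql": query_sql,
}

@dataclass
class Step:
    tool: str
    query: str
    results: List[str]

# An action is (tool name, query), or None when the agent decides to stop.
Action = Optional[Tuple[str, str]]

def run_agent(question: str,
              policy: Callable[[str, List[Step]], Action],
              max_steps: int = 5) -> List[Step]:
    """The agent, not the pipeline, picks each retrieval call; max_steps bounds latency and cost."""
    history: List[Step] = []
    for _ in range(max_steps):
        action = policy(question, history)
        if action is None:  # agent judges the evidence sufficient
            break
        tool, query = action
        history.append(Step(tool, query, TOOLS[tool](query)))
    return history

def scripted_policy(question: str, history: List[Step]) -> Action:
    """Hypothetical stand-in for the LLM: broad search, then a reformulated follow-up, then SQL."""
    if not history:
        return ("search_vector", question)
    if history[-1].tool == "search_vector":
        return ("search_keyword", "bolt labs")  # follow up on an entity found in step 1
    if history[-1].tool == "search_keyword":
        return ("query_sql",
                "SELECT year, amount FROM revenue WHERE company = 'acme' ORDER BY year")
    return None

if __name__ == "__main__":
    for step in run_agent("what did acme acquire and how did revenue change", scripted_policy):
        print(step.tool, step.query, step.results)
```

In a real deployment the policy would be an LLM call that receives the question plus the accumulated history and returns either the next tool call or a decision to stop. The max_steps cap is one simple way to bound the extra latency and cost described above.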
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T21:22:52.162259+00:00— report_created — created