Report #60569
[frontier] Vector search RAG returns irrelevant results for complex multi-faceted queries
Replace single-shot vector similarity search with agentic retrieval: an LLM step that decomposes the query, plans which sources and search strategies to use, executes searches, evaluates result quality, and iterates if results are insufficient. The agent treats retrieval as a multi-turn process, not a single function call.
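The loop described above can be sketched as follows. This is a minimal illustration, not a reference implementation: `agentic_retrieve` and its four callables (`decompose`, `search`, `is_sufficient`, `reformulate`) are hypothetical names, and the toy stubs stand in for what would be LLM calls and real search backends.

```python
from typing import Callable

def agentic_retrieve(
    query: str,
    decompose: Callable[[str], list[str]],
    search: Callable[[str], list[str]],
    is_sufficient: Callable[[str, list[str]], bool],
    reformulate: Callable[[str], str],
    max_rounds: int = 3,
) -> tuple[dict[str, list[str]], list[str]]:
    """Return (answered sub-queries -> documents, sub-queries still unanswered)."""
    answered: dict[str, list[str]] = {}
    pending = decompose(query)            # split the question into sub-queries
    for _ in range(max_rounds):           # bound the number of LLM/search rounds
        if not pending:
            break
        retry = []
        for sub_query in pending:
            docs = search(sub_query)
            if is_sufficient(sub_query, docs):   # evaluate result quality
                answered[sub_query] = docs
            else:
                retry.append(reformulate(sub_query))  # rewrite and try again
        pending = retry
    return answered, pending

# Toy stand-ins (assumptions for illustration only).
DOCS = [
    "authentication middleware configuration",
    "session token expiry settings",
]
SYNONYMS = {"auth": "authentication"}

def toy_decompose(query: str) -> list[str]:
    return [part.strip() for part in query.split(" and ")]

def toy_search(sub_query: str) -> list[str]:
    # Keyword search: every query word must appear in the document.
    words = set(sub_query.split())
    return [doc for doc in DOCS if words <= set(doc.split())]

def toy_is_sufficient(sub_query: str, docs: list[str]) -> bool:
    return len(docs) > 0

def toy_reformulate(sub_query: str) -> str:
    return " ".join(SYNONYMS.get(word, word) for word in sub_query.split())

answered, unanswered = agentic_retrieve(
    "auth middleware and token expiry",
    toy_decompose, toy_search, toy_is_sufficient, toy_reformulate,
)
```

In this toy run the sub-query "auth middleware" finds nothing in the first round, is rewritten to "authentication middleware", and succeeds in the second. The `max_rounds` cap is one way to keep the extra LLM calls bounded.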
Journey Context:
Naive RAG pipeline: embed query → vector search → return top-k → stuff into context → generate. It fails on complex queries in three ways: (1) the query doesn't match document language (the user asks 'how to handle auth' but the docs say 'authentication middleware configuration'), (2) a single search can't span multiple sub-topics, and (3) there is no feedback loop when results are poor, so the generator just works with whatever it got. Agentic retrieval addresses all three: the agent can reformulate queries, try multiple search strategies (vector + keyword + SQL + web), decompose a complex question into sub-queries, and, critically, assess whether the retrieved documents actually answer the question and re-search if they do not. This is the difference between a librarian who hands you one book and a research assistant who iteratively refines their search. Tradeoff: 2-5x more LLM calls and higher latency per query, in exchange for substantially better recall on complex queries. In production, you can gate the iterative loop: simple factual queries get single-shot RAG, and complex analytical queries get agentic retrieval.
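The production gate mentioned above could look roughly like this. It is a sketch under stated assumptions: `is_complex`, `route`, the marker word list, and the 12-word threshold are all hypothetical and chosen only for illustration; a real deployment might use a small classifier model instead of a keyword heuristic.

```python
import re
from typing import Callable

# Hypothetical markers suggesting a multi-faceted or analytical question.
ANALYTICAL_MARKERS = {"compare", "why", "versus", "tradeoff", "and"}

def is_complex(query: str, max_simple_words: int = 12) -> bool:
    """Heuristic gate: long queries or analytical wording count as complex."""
    words = re.findall(r"[a-z']+", query.lower())
    if len(words) > max_simple_words:
        return True
    return any(word in ANALYTICAL_MARKERS for word in words)

def route(
    query: str,
    single_shot: Callable[[str], str],
    agentic: Callable[[str], str],
) -> str:
    """Send simple queries to single-shot RAG and complex ones to the agent."""
    handler = agentic if is_complex(query) else single_shot
    return handler(query)

# Stub pipelines that only report which path was taken.
simple_path = route(
    "What port does the server use?",
    lambda q: "single-shot", lambda q: "agentic",
)
complex_path = route(
    "Compare auth middleware and token expiry handling",
    lambda q: "single-shot", lambda q: "agentic",
)
```

Routing this way keeps the 2-5x LLM call overhead confined to queries that are likely to need it; misrouted queries cost either latency or recall, so the heuristic is worth tuning against real traffic.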
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:09:22.562301+00:00 - report_created - created