Report #78893
[frontier] Agent always retrieves from vector store regardless of whether retrieval is needed, wasting tokens and introducing noise
Implement a retrieval router: let the agent first decide IF retrieval is needed, then WHICH source to query (vector store, knowledge graph, web search, SQL), then EVALUATE retrieval quality and re-retrieve or rephrase if results are insufficient. This makes retrieval an agentic behavior, not a fixed pipeline step.
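A minimal sketch of the first two decisions (IF and WHICH), assuming the routing call returns a single text label. The names `Source`, `route`, and `keyword_classifier` are hypothetical and not from any library. `keyword_classifier` is only a stand-in for the LLM routing call so the example runs on its own.

```python
from enum import Enum
from typing import Callable, Optional


class Source(Enum):
    VECTOR_STORE = "vector_store"
    KNOWLEDGE_GRAPH = "knowledge_graph"
    WEB_SEARCH = "web_search"
    SQL = "sql"


def route(query: str, classify: Callable[[str], str]) -> Optional[Source]:
    """Return the source to query, or None when no retrieval is needed.

    `classify` stands in for an LLM call that answers with one label:
    "none" or one of the Source values. Unknown labels fall back to the
    vector store (an assumed default) so a malformed reply never skips
    retrieval silently.
    """
    label = classify(query).strip().lower()
    if label == "none":
        return None
    try:
        return Source(label)
    except ValueError:
        return Source.VECTOR_STORE


def keyword_classifier(query: str) -> str:
    """Hypothetical stand-in for the LLM; keyword rules for illustration only."""
    q = query.lower()
    if any(w in q for w in ("how many", "total", "average")):
        return "sql"
    if any(w in q for w in ("latest", "today", "news")):
        return "web_search"
    if "related to" in q:
        return "knowledge_graph"
    if len(q.split()) <= 3:
        return "none"
    return "vector_store"
```

Usage: `route("How many orders shipped in May?", keyword_classifier)` selects `Source.SQL`, while a short greeting returns `None` and the agent answers without retrieving. Whether an unknown label should fall back to the vector store or to no retrieval is a design choice. The sketch picks the vector store.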
Journey Context:
Naive RAG always retrieves, so simple questions get noisy, irrelevant context and complex questions get insufficient context from a single source. The emerging pattern, agentic RAG or Self-RAG, makes retrieval a conditional, multi-step agent behavior: the agent decides when to retrieve, from where, and whether the results are good enough. This replaces the fixed RAG pipeline with a flexible retrieval loop. Tradeoff: routing and evaluation add LLM calls and latency, in exchange for better precision and recall. The critical mistake is adding more retrieval steps without adding the evaluation step. If the agent cannot judge retrieval quality, more retrieval just adds more noise. Always close the loop with a relevance/sufficiency check.
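The closed loop described above can be sketched as follows. This is one possible shape under stated assumptions, not a reference implementation: `retrieve`, `is_sufficient`, and `rephrase` are hypothetical callables (the last two would typically be LLM calls), and `max_attempts` is an assumed budget that bounds the extra calls and latency.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RetrievalResult:
    query: str            # the last query actually sent to the retriever
    passages: List[str]   # passages from the last attempt
    sufficient: bool      # whether the check passed
    attempts: int         # number of retrieval calls made


def retrieve_with_check(
    query: str,
    retrieve: Callable[[str], List[str]],
    is_sufficient: Callable[[str, List[str]], bool],
    rephrase: Callable[[str], str],
    max_attempts: int = 3,
) -> RetrievalResult:
    """Retrieve, grade the results, and rephrase until they pass or the budget runs out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    current = query
    passages: List[str] = []
    for attempt in range(1, max_attempts + 1):
        passages = retrieve(current)
        # Grade against the ORIGINAL question, not the rephrased query,
        # so rewrites cannot drift away from what the user asked.
        if is_sufficient(query, passages):
            return RetrievalResult(current, passages, True, attempt)
        if attempt < max_attempts:
            current = rephrase(current)

    # Budget exhausted: report failure so the caller can answer without
    # context or say it does not know, rather than use noisy passages.
    return RetrievalResult(current, passages, False, max_attempts)
```

Returning `sufficient=False` instead of raising leaves the fallback decision to the caller, which keeps the loop usable for each source the router might select.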
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:01:04.422882+00:00— report_created — created