Report #65996
[frontier] My RAG retrieves irrelevant chunks for multi-hop questions because it does single-shot vector similarity on the initial query.
Implement Agentic RAG, where retrieval is an iterative tool rather than a one-time pre-fetch. Give the agent a `search` tool that returns sources. The agent loops: it calls `search`, analyzes the results, and decides whether they are sufficient. If they are not, it reformulates the query to target the missing information and repeats. Include a 'sufficiency check' (either LLM-based or heuristic) to exit the loop and synthesize the answer. This replaces pre-fetched context with an active retrieval loop.
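A minimal sketch of that loop, not tied to any framework. Every name here (`Source`, `agentic_rag`, `make_term_coverage_check`, and the four callables) is hypothetical. In practice `search` would wrap your vector store, and `reformulate` and `synthesize` would be LLM calls. The `max_steps` cap is an added assumption so the loop always terminates.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Source:
    doc_id: str
    text: str


# Hypothetical callable shapes; supply your own retriever and LLM wrappers.
SearchFn = Callable[[str], List[Source]]
SufficientFn = Callable[[str, List[Source]], bool]
ReformulateFn = Callable[[str, List[Source]], Optional[str]]
SynthesizeFn = Callable[[str, List[Source]], str]


def agentic_rag(
    question: str,
    search: SearchFn,
    is_sufficient: SufficientFn,
    reformulate: ReformulateFn,
    synthesize: SynthesizeFn,
    max_steps: int = 4,  # assumption: hard cap so the loop cannot run forever
) -> str:
    evidence: List[Source] = []
    seen = set()
    query = question
    for _ in range(max_steps):
        # Call the search tool and keep only sources not seen before.
        for src in search(query):
            if src.doc_id not in seen:
                seen.add(src.doc_id)
                evidence.append(src)
        # Sufficiency check: exit the loop once the evidence is enough.
        if is_sufficient(question, evidence):
            break
        # Otherwise ask for a new query aimed at the missing information.
        next_query = reformulate(question, evidence)
        if next_query is None or next_query == query:
            break  # no new direction to explore
        query = next_query
    return synthesize(question, evidence)


def make_term_coverage_check(required_terms: List[str]) -> SufficientFn:
    """Heuristic sufficiency check: every required term appears in the evidence."""
    terms = [t.lower() for t in required_terms]

    def check(question: str, evidence: List[Source]) -> bool:
        blob = " ".join(s.text.lower() for s in evidence)
        return all(t in blob for t in terms)

    return check
```

The term-coverage check is only one possible heuristic. An LLM-based check would have the same signature and could replace it without changing the loop.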
Journey Context:
Standard RAG (query -> embed -> top-k -> answer) fails on multi-hop reasoning (e.g., 'What did the CEO of X, which acquired Y, say about Z?'). It retrieves chunks about X, Y, and Z separately but misses the acquisition link that connects them. The frontier is 'Agentic RAG', also called 'iterative retrieval'. The LLM treats the knowledge base as an environment it explores via tool use, not as static context to pre-fetch. The agent plans a query, observes the results, and decides whether to continue, forming a retrieval loop. This replaces vector-only RAG with a reasoning-driven search strategy, which matters for complex knowledge work. Frameworks such as LangGraph and LlamaIndex are increasingly adopting this pattern as an alternative to naive RAG.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:15:21.185003+00:00— report_created — created