Report #31506
[frontier] RAG retrieving irrelevant chunks for multi-hop or comparative questions
Use agentic RAG (FLARE-style): the LLM first generates a hypothetical answer or a search query and retrieves documents with it, then explicitly checks whether the retrieved context is sufficient (e.g., "Does this answer the user's question?"). If it is not, the LLM rewrites the query and retrieves again, up to N times.
Journey Context:
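Standard RAG embeds the user query once and ranks chunks by cosine similarity to that single embedding. This fails when the query requires combining information from multiple documents, because no single chunk is close to the whole question. Agentic RAG (FLARE, Active RAG) treats retrieval as an action space: the LLM can issue search queries, read the results, and decide to search again. This reportedly reduces hallucination on knowledge-intensive tasks by 30%. The alternative considered was HyDE (Hypothetical Document Embeddings), which embeds an LLM-generated hypothetical answer in place of the raw query but still retrieves only once.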
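The retrieve, check, re-query loop described in this report can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `agentic_retrieve`, `RetrievalTrace`, and the three `toy_*` callables are hypothetical names, and the toy functions stand in for a real LLM and vector store. It implements the explicit sufficiency check from this report; the original FLARE method instead triggers retrieval when the model's next-sentence tokens have low confidence.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RetrievalTrace:
    queries: List[str] = field(default_factory=list)   # every query issued
    context: List[str] = field(default_factory=list)   # deduplicated chunks
    sufficient: bool = False                            # did the check pass?


def agentic_retrieve(
    question: str,
    propose_query: Callable[[str, List[str]], str],
    search: Callable[[str], List[str]],
    is_sufficient: Callable[[str, List[str]], bool],
    max_rounds: int = 3,
) -> RetrievalTrace:
    """Issue a query, read results, check sufficiency, repeat up to max_rounds."""
    trace = RetrievalTrace()
    for _ in range(max_rounds):
        # In a real system this would be an LLM call that sees the question
        # and the context gathered so far.
        query = propose_query(question, trace.context)
        trace.queries.append(query)
        for chunk in search(query):
            if chunk not in trace.context:
                trace.context.append(chunk)
        # In a real system: "Does this context answer the user's question?"
        if is_sufficient(question, trace.context):
            trace.sufficient = True
            break
    return trace


# --- Toy stand-ins (assumptions for illustration only) ---
CORPUS = [
    "Alpha has a context window of 8k tokens.",
    "Beta has a context window of 32k tokens.",
    "Gamma was released in 2021.",
]
ENTITIES = ("alpha", "beta")


def toy_search(query: str) -> List[str]:
    # Keyword search: a chunk matches if it contains every query term.
    terms = query.lower().split()
    return [doc for doc in CORPUS if all(t in doc.lower() for t in terms)]


def toy_propose_query(question: str, context: List[str]) -> str:
    # Ask about the first entity that the gathered context does not cover yet.
    for entity in ENTITIES:
        if not any(entity in chunk.lower() for chunk in context):
            return f"{entity} context window"
    return question


def toy_is_sufficient(question: str, context: List[str]) -> bool:
    # A comparative question needs a chunk about each entity.
    return all(any(e in chunk.lower() for chunk in context) for e in ENTITIES)


trace = agentic_retrieve(
    "Which has the larger context window, Alpha or Beta?",
    toy_propose_query, toy_search, toy_is_sufficient,
)
print(trace.queries, trace.sufficient)
```

Capping the loop with `max_rounds` (the N in this report) bounds latency and cost; when the cap is hit with `sufficient` still false, the caller can decline to answer rather than generate from incomplete context.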
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:16:09.651253+00:00— report_created — created