Report #49909
[synthesis] Why does my RAG pipeline return shallow or incomplete answers for complex queries?
Replace single-shot retrieve-then-generate with an iterative agentic retrieval loop: decompose query → parallel initial search → assess information gaps → issue targeted follow-up searches → synthesize. The model must be able to search again based on what it just learned.
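The loop described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `iterative_rag` and its four callables (`decompose`, `search`, `find_gaps`, `synthesize`) are hypothetical names standing in for your own LLM and search calls, and `max_rounds` is an assumed simple cap on retrieval passes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical signatures; plug in your own LLM / search implementations.
Decompose = Callable[[str], List[str]]            # query -> sub-queries
Search = Callable[[str], List[str]]               # sub-query -> passages
FindGaps = Callable[[str, List[str]], List[str]]  # (query, evidence) -> follow-up queries
Synthesize = Callable[[str, List[str]], str]      # (query, evidence) -> final answer


def iterative_rag(
    query: str,
    decompose: Decompose,
    search: Search,
    find_gaps: FindGaps,
    synthesize: Synthesize,
    max_rounds: int = 3,  # assumed cap on retrieval passes
) -> str:
    evidence: List[str] = []
    seen = set()                 # sub-queries already issued
    pending = decompose(query)   # step 1: decompose the query

    for _ in range(max_rounds):
        # Drop duplicates and anything already searched, preserving order.
        pending = [q for q in dict.fromkeys(pending) if q not in seen]
        if not pending:
            break                # no open gaps left
        seen.update(pending)

        # Step 2 (and 4 on later rounds): run this round's searches in parallel.
        with ThreadPoolExecutor(max_workers=len(pending)) as pool:
            for passages in pool.map(search, pending):
                evidence.extend(passages)

        # Step 3: ask what is still missing, given what was just retrieved.
        pending = find_gaps(query, evidence)

    # Step 5: synthesize from everything gathered.
    return synthesize(query, evidence)
```

The key design point is that `find_gaps` sees the evidence gathered so far, so follow-up queries depend on what the previous pass returned rather than being fixed upfront.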
Journey Context:
Basic RAG assumes the model knows what to search for upfront. That holds only for simple factual lookups. Perplexity's observable API behavior suggests its architecture does query decomposition and multi-pass retrieval: Pro mode visibly issues follow-up searches after evaluating initial results, and the citation structure shows distinct retrieval passes. The same pattern appears in AI coding agents that re-search a codebase after reading the initial files. The cost is higher latency and more token consumption, but answers to multi-part or multi-hop queries become noticeably more complete. The common mistake is treating retrieval as a preprocessing step rather than as part of the agent loop. The main risk: without good stopping criteria, iterative retrieval can loop indefinitely, so you need a retrieval budget (a maximum number of searches and a diminishing-relevance threshold).
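One way to express the retrieval budget is a small guard object consulted after each search. This is a sketch under stated assumptions: `RetrievalBudget`, its field names, and the default values are hypothetical, relevance scores are assumed to be floats in [0, 1] from your retriever or reranker, and "diminishing relevance" is interpreted here as "the best hit from the latest search falls below a threshold".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RetrievalBudget:
    max_searches: int = 8       # hard cap on search calls (assumed default)
    min_relevance: float = 0.3  # stop once the best new hit scores below this (assumed default)
    searches_used: int = 0

    def should_continue(self, latest_scores: List[float]) -> bool:
        """Record one completed search and report whether another is allowed."""
        self.searches_used += 1
        if self.searches_used >= self.max_searches:
            return False  # budget exhausted
        # An empty result list counts as relevance 0.0, which also stops the loop.
        return max(latest_scores, default=0.0) >= self.min_relevance
```

Calling `budget.should_continue(scores)` after every search and breaking out of the agent loop when it returns `False` bounds both latency and token spend.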
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:15:26.060355+00:00— report_created — created