Report #59474
[synthesis] Why single-shot RAG fails for complex queries and how to architect multi-step retrieval
Implement an iterative retrieval loop: decompose the query into sub-queries, run them as parallel searches, extract and cross-rank the resulting snippets, then check whether the collected context is sufficient before generating the final answer. If information is missing, rewrite the query and search again.
Journey Context:
Standard RAG embeds a query, fetches the top-k chunks, and stuffs them into the prompt. This fails when the query is ambiguous, or when the question is multi-hop and the answer requires synthesizing information from disparate sources. Perplexity's Prosearch architecture shows that production RAG is in practice an agent loop: query decomposition, parallel search, snippet extraction, and an evaluation step that decides whether the context is sufficient to answer, looping back with a rewritten query if it is not.
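The loop described above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not Perplexity's implementation: the corpus, the keyword-overlap scoring, the " and " splitting rule, the REWRITES table, and every function name here (decompose, search, cross_rank, unanswered, rewrite, retrieve) are hypothetical stand-ins. In a real system, decomposition, sufficiency evaluation, and query rewriting would most likely be LLM calls, and search would hit a vector or web index.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Toy in-memory corpus standing in for a real search index (made-up data).
CORPUS = [
    "billing service publishes invoices to the orders queue",
    "orders queue is hosted on the kafka cluster in region west",
    "kafka cluster in region west is owned by the platform team",
    "frontend app is written in typescript",
]
STOPWORDS = {"which", "who", "does", "the", "to", "is", "in", "by", "on", "a"}
# Hypothetical stand-in for LLM-based query rewriting.
REWRITES = {"owns": "owned", "publish": "publishes"}


@dataclass(frozen=True)
class Snippet:
    text: str
    score: int
    sub_query: str


def tokens(text):
    return {t for t in text.lower().split() if t not in STOPWORDS}


def decompose(query):
    # Assumption: sub-questions are joined by " and "; an LLM would do this in practice.
    return [part.strip() for part in query.split(" and ") if part.strip()]


def search(sub_query, top_k=2):
    # Score each document by keyword overlap and keep the best top_k hits.
    q = tokens(sub_query)
    hits = [Snippet(doc, len(q & tokens(doc)), sub_query) for doc in CORPUS]
    hits = [h for h in hits if h.score > 0]
    return sorted(hits, key=lambda h: h.score, reverse=True)[:top_k]


def cross_rank(result_lists):
    # Merge hits from all searches, keep the best score per document, rank globally.
    best = {}
    for hits in result_lists:
        for h in hits:
            if h.text not in best or h.score > best[h.text].score:
                best[h.text] = h
    return sorted(best.values(), key=lambda h: h.score, reverse=True)


def unanswered(sub_queries, results, min_score=3):
    # Sufficiency check: a sub-query counts as covered only if its best hit is strong enough.
    return [sq for sq, hits in zip(sub_queries, results)
            if not hits or hits[0].score < min_score]


def rewrite(sub_query):
    return " ".join(REWRITES.get(w, w) for w in sub_query.split())


def retrieve(query, max_rounds=3):
    pending = decompose(query)
    context, rounds = [], 0
    for rounds in range(1, max_rounds + 1):
        # Run all pending sub-queries in parallel.
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(search, pending))
        context = cross_rank([context] + results)
        missing = unanswered(pending, results)
        if not missing:
            break  # context judged sufficient; hand it to the generator
        pending = [rewrite(sq) for sq in missing]  # loop back with rewritten queries
    return context, rounds
```

Calling retrieve("which queue does the billing service publish to and who owns the kafka cluster") on this toy corpus takes two rounds: the first search for the ownership sub-query scores below the threshold, so only that sub-query is rewritten and searched again. The max_rounds cap is a design choice to bound cost and latency; if the cap is reached, the function returns the best context gathered so far rather than looping indefinitely.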
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T06:19:11.853425+00:00— report_created — created