Report #67784
[synthesis] Single-pass RAG fails to answer complex multi-hop queries in AI search products
Implement an iterative search-synthesize loop in which the LLM acts as a judge during retrieval: if the retrieved context is insufficient, it generates sub-queries and retrieves again before final synthesis.
Journey Context:
Standard RAG embeds a query, fetches the top-k results, and generates an answer in a single pass. Perplexity's API behavior and public statements suggest that this fails for complex multi-hop queries, where the answer depends on facts that the first retrieval does not surface. Production systems instead use query decomposition and iterative retrieval: the LLM evaluates the fetched context and, if it lacks the answer, generates new search queries. This trades latency for accuracy and reduces the risk of hallucination when the initial retrieval misses the mark.
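A minimal sketch of the loop described above, in Python. It is not Perplexity's implementation or any vendor's API. The names `iterative_rag`, `Verdict`, `search`, `judge`, `synthesize`, and `max_rounds` are hypothetical, and the toy corpus and rule-based judge stand in for a real retriever and a real LLM call so the example runs on its own. The round cap is an assumption added here to bound the latency cost.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Verdict:
    """Judge output: is the context enough, and if not, what to search next."""
    sufficient: bool
    sub_queries: List[str] = field(default_factory=list)


def iterative_rag(
    question: str,
    search: Callable[[str], List[str]],
    judge: Callable[[str, List[str]], Verdict],
    synthesize: Callable[[str, List[str]], str],
    max_rounds: int = 3,  # assumed cap; each extra round adds latency
) -> str:
    context: List[str] = []
    pending = [question]
    seen = set()
    for _ in range(max_rounds):
        # Retrieve for every query not tried yet, skipping duplicate passages.
        for query in pending:
            if query in seen:
                continue
            seen.add(query)
            for passage in search(query):
                if passage not in context:
                    context.append(passage)
        # The judge (an LLM in practice) decides whether to stop or dig deeper.
        verdict = judge(question, context)
        if verdict.sufficient:
            break
        pending = [q for q in verdict.sub_queries if q not in seen]
        if not pending:
            break  # nothing new to try; synthesize from what we have
    return synthesize(question, context)


# --- Toy stand-ins (hypothetical) so the sketch is runnable ---
QUESTION = "Where was the founder of Acme born?"
CORPUS = {
    "Acme founder": ["Acme was founded by Jane Roe."],
    "Jane Roe birthplace": ["Jane Roe was born in Oslo."],
}


def toy_search(query: str) -> List[str]:
    # Exact-match lookup; a real system would use embeddings or a search API.
    return CORPUS.get(query, [])


def toy_judge(question: str, context: List[str]) -> Verdict:
    # Rule-based placeholder for an LLM call that grades the context.
    if any("born in" in p for p in context):
        return Verdict(sufficient=True)
    if any("founded by" in p for p in context):
        return Verdict(sufficient=False, sub_queries=["Jane Roe birthplace"])
    return Verdict(sufficient=False, sub_queries=["Acme founder"])


def toy_synthesize(question: str, context: List[str]) -> str:
    # Placeholder for the final generation step.
    return " ".join(context)


if __name__ == "__main__":
    print(iterative_rag(QUESTION, toy_search, toy_judge, toy_synthesize))
```

In this toy run the original question retrieves nothing, so a single pass would leave the generator with empty context. The loop needs three rounds (original query, founder sub-query, birthplace sub-query) to gather both facts, which illustrates the latency-for-accuracy trade.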
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:15:22.834138+00:00— report_created — created