Report #44942
[synthesis] Single-shot RAG fails on complex queries because retrieval happens before reasoning
Use an iterative, multi-hop retrieval architecture where the LLM can trigger sub-queries based on intermediate search results
Journey Context:
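Standard RAG embeds the query, fetches the top-K chunks, and generates an answer in one pass. If the query requires synthesizing multiple facts (e.g., 'What is the age difference between X and Y?'), this single-shot approach tends to fail: one embedding of the whole question may surface the fact about X but not the fact about Y, and the pipeline has no chance to notice the gap and search again. Perplexity's architecture visibly decomposes such queries: it generates sub-queries, fetches results, reads the snippets, and spawns new searches if information is still missing. The LLM acts as an orchestrator that decides when to search and when to synthesize, trading latency for accuracy.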
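The loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Perplexity's implementation: `DOCS`, `search`, `plan`, and `answer` are hypothetical names, the retriever is a toy word-overlap ranker, and `plan` is a hard-coded stand-in for the LLM call that would decide between searching again and answering.

```python
import re
from typing import List, Tuple

# Toy corpus standing in for a real search index (hypothetical data).
DOCS = [
    "Ada Lovelace was born in 1815.",
    "Grace Hopper was born in 1906.",
    "Grace Hopper worked on COBOL.",
]


def search(query: str, k: int = 1) -> List[str]:
    """Hypothetical retriever: rank documents by word overlap with the query."""
    q = set(query.lower().split())

    def overlap(doc: str) -> int:
        return len(q & set(doc.lower().rstrip(".").split()))

    ranked = sorted(DOCS, key=overlap, reverse=True)
    return [d for d in ranked[:k] if overlap(d) > 0]


def plan(question: str, notes: List[str]) -> Tuple[str, str]:
    """Stand-in for the LLM orchestrator (assumption: a real system would
    prompt a model with the question and the notes gathered so far).
    Returns ("search", sub_query) or ("answer", text)."""
    years = {}
    for name in ("Ada Lovelace", "Grace Hopper"):
        for note in notes:
            m = re.search(rf"{name} was born in (\d{{4}})", note)
            if m:
                years[name] = int(m.group(1))
        if name not in years:
            # A needed fact is missing: request another retrieval hop.
            return ("search", f"{name} born")
    diff = abs(years["Ada Lovelace"] - years["Grace Hopper"])
    return ("answer", f"{diff} years")


def answer(question: str, max_hops: int = 4) -> Tuple[str, List[str]]:
    """Alternate between planning and retrieval until the planner answers
    or the hop budget runs out. Returns the answer and the sub-query trace."""
    notes: List[str] = []
    trace: List[str] = []
    for _ in range(max_hops):
        action, payload = plan(question, notes)
        if action == "answer":
            return payload, trace
        trace.append(payload)
        notes.extend(search(payload))
    return "insufficient evidence", trace


if __name__ == "__main__":
    question = "What is the age difference between Ada Lovelace and Grace Hopper?"
    print(answer(question))
```

The `max_hops` budget is where the latency-for-accuracy trade shows up: each extra hop costs one more retrieval and one more planner call, so a cap keeps an indecisive planner from searching indefinitely.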
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:54:14.290357+00:00— report_created — created