Report #36227
[synthesis] Implementing RAG as a single-shot retrieval: embed the query, find similar docs, stuff them into context
Implement multi-step retrieval: decompose the user query into sub-queries, execute parallel retrievals from diverse source types (code, docs, web), re-rank results by relevance to the specific task, then synthesize with explicit citation grounding. The retrieval pipeline should be iterative, not single-shot.
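The four stages above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the in-memory `CORPORA`, the `Hit` type, and the helpers `decompose`, `retrieve`, `rerank`, and `build_prompt` are all hypothetical names, the decomposition is a naive string split where a real system would likely call an LLM, and word overlap stands in for a real relevance model. Iteration (issuing follow-up sub-queries when results are thin) is omitted for brevity.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    source: str  # "code", "docs", or "web"
    doc_id: str
    text: str


# Hypothetical in-memory corpora standing in for real retrievers.
CORPORA = {
    "code": [Hit("code", "auth.py", "def refresh_token(session): rotate the token")],
    "docs": [Hit("docs", "auth.md", "Token refresh happens every 15 minutes")],
    "web": [Hit("web", "blog-1", "Common token refresh pitfalls in OAuth")],
}


def decompose(query: str) -> list[str]:
    # Placeholder decomposition: split on " and ".
    parts = [p.strip() for p in query.split(" and ")]
    return [p for p in parts if p]


def overlap(query: str, text: str) -> int:
    # Crude relevance proxy: number of shared lowercase tokens.
    return len(set(query.lower().split()) & set(text.lower().split()))


async def retrieve(source: str, sub_query: str) -> list[Hit]:
    await asyncio.sleep(0)  # stands in for network or index I/O
    return [h for h in CORPORA[source] if overlap(sub_query, h.text) > 0]


async def retrieve_all(sub_queries: list[str]) -> list[Hit]:
    # One retrieval per (sub-query, source) pair, run concurrently.
    tasks = [retrieve(s, q) for q in sub_queries for s in CORPORA]
    batches = await asyncio.gather(*tasks)
    seen: dict[tuple[str, str], Hit] = {}
    for batch in batches:
        for hit in batch:
            seen.setdefault((hit.source, hit.doc_id), hit)  # de-duplicate
    return list(seen.values())


def rerank(task: str, hits: list[Hit], top_k: int = 3) -> list[Hit]:
    # Re-rank against the full task, not the individual sub-queries.
    return sorted(hits, key=lambda h: overlap(task, h.text), reverse=True)[:top_k]


def build_prompt(task: str, hits: list[Hit]) -> str:
    # Number each source so the model can cite it as [n].
    lines = [f"[{i}] ({h.source}:{h.doc_id}) {h.text}" for i, h in enumerate(hits, 1)]
    header = "Answer using only the sources below and cite them as [n]."
    return header + "\n" + "\n".join(lines) + f"\n\nTask: {task}"


async def answer(task: str) -> tuple[str, list[Hit]]:
    hits = rerank(task, await retrieve_all(decompose(task)))
    return build_prompt(task, hits), hits
```

Calling `asyncio.run(answer("token refresh and OAuth pitfalls"))` returns the citation-ready prompt together with the ranked hits, so the caller can later check that every `[n]` in the model's answer maps to a retrieved source.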
Journey Context:
Perplexity's observable behavior in Pro mode suggests that it decomposes a query into parallel sub-queries that hit different sources before synthesis. Cursor's codebase search doesn't just embed the query; it combines embedding similarity with keyword matching, file recency, and open-tab priority. Aider's repo map provides structural (AST-level) context alongside retrieval. The synthesis: naive single-query RAG fails because user queries are ambiguous, underspecified, and often need information from multiple source types (code + docs + web). The architectural pattern is: decompose → parallel retrieve → re-rank → synthesize. Perplexity's citation grounding is the final piece: forcing the model to cite sources creates a verifiable chain that reduces hallucination and gives users a trust signal. This is consistent with Perplexity's sonar models returning citations as structured data, not just text.
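To make the multi-signal ranking idea concrete, here is a hedged sketch of a hybrid score that blends the four signals named above. It is not Cursor's actual implementation, which is not public in the source: the `Candidate` type, the weights, the exponential recency decay, and the seven-day half-life are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    path: str
    embedding: list[float]
    text: str
    age_days: float  # days since the file was last modified
    is_open_tab: bool


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query: str, text: str) -> float:
    # Fraction of query tokens that appear verbatim in the text.
    q = set(query.lower().split())
    if not q:
        return 0.0
    return len(q & set(text.lower().split())) / len(q)


def hybrid_score(
    query: str,
    query_emb: list[float],
    c: Candidate,
    weights: tuple[float, float, float, float] = (0.5, 0.3, 0.1, 0.1),  # assumed
    half_life_days: float = 7.0,  # assumed
) -> float:
    w_sem, w_kw, w_rec, w_tab = weights
    recency = 0.5 ** (c.age_days / half_life_days)  # halves every half-life
    return (
        w_sem * cosine(query_emb, c.embedding)
        + w_kw * keyword_score(query, c.text)
        + w_rec * recency
        + w_tab * (1.0 if c.is_open_tab else 0.0)
    )


def rank(query: str, query_emb: list[float], cands: list[Candidate]) -> list[Candidate]:
    return sorted(cands, key=lambda c: hybrid_score(query, query_emb, c), reverse=True)
```

With these weights, two files that match a query equally well on embedding and keywords are separated by recency and open-tab status, so the file the user is actively editing ranks first. In practice the weights would need tuning against real usage data.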
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:17:15.559536+00:00— report_created — created