Report #55042
[synthesis] Passing user queries directly to search engines in a conversational RAG pipeline
Implement a query rewriting step that contextualizes the user's prompt against conversation history *before* retrieval, and execute parallel searches across heterogeneous indices (web, academic, YouTube) before synthesis.
Journey Context:
Naive RAG pipelines embed the user's raw query and search with it, which fails when the query is conversational and depends on earlier turns (e.g., 'what about their revenue?', where 'their' has no referent outside the conversation). Perplexity's observable API behavior and UI suggest a multi-step pipeline: first, a query rewriting model contextualizes the prompt; second, the system fires parallel searches against different engines; third, a synthesis model reads all snippets and generates the answer with citations. The tradeoff is higher latency and cost per query, but the alternative, irrelevant retrieval in multi-turn conversations, destroys user trust and answer quality.
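The three stages described above can be sketched roughly as follows. This is a minimal illustration, not Perplexity's implementation: `rewrite_query`, `search_index`, `retrieve`, `synthesize`, and `answer` are all hypothetical names, the rewriting and synthesis steps are trivial stand-ins for model calls, and the search backends return canned results from a placeholder `example.com` URL. The part meant to carry over is the shape: rewrite first, fan out with `asyncio.gather`, then synthesize over the merged snippets.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Snippet:
    source: str  # which index produced the snippet, e.g. "web"
    url: str
    text: str


def rewrite_query(history: list[str], query: str) -> str:
    """Stand-in for an LLM rewriting call (hypothetical).

    A real system would prompt a model with the history and the new query.
    Here we only append the most recent turn so the sketch stays runnable.
    """
    if not history:
        return query
    return f"{query} (context: {history[-1]})"


async def search_index(index: str, query: str) -> list[Snippet]:
    """Stand-in for one search backend (hypothetical); returns a canned hit."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [Snippet(index, f"https://example.com/{index}", f"{index} result for: {query}")]


async def retrieve(query: str, indices: list[str]) -> list[Snippet]:
    # Run all searches concurrently; gather returns results in the order of `indices`.
    results = await asyncio.gather(*(search_index(i, query) for i in indices))
    return [snippet for hits in results for snippet in hits]


def synthesize(query: str, snippets: list[Snippet]) -> str:
    """Stand-in for the synthesis model: lists snippets with numbered citations."""
    lines = [f"Answer to: {query}"]
    lines += [f"[{n}] {s.text} ({s.url})" for n, s in enumerate(snippets, start=1)]
    return "\n".join(lines)


async def answer(history: list[str], query: str) -> str:
    standalone = rewrite_query(history, query)  # step 1: contextualize
    snippets = await retrieve(standalone, ["web", "academic", "youtube"])  # step 2: parallel search
    return synthesize(standalone, snippets)  # step 3: synthesize with citations


if __name__ == "__main__":
    # Illustrative conversation; "Acme Corp" is a made-up example.
    print(asyncio.run(answer(["Tell me about Acme Corp"], "what about their revenue?")))
```

Because the searches run concurrently, retrieval latency is bounded by the slowest backend rather than the sum of all of them, which offsets some of the cost of the extra rewriting step. In a real deployment each `search_index` call would likely also need a timeout and error handling so that one failing index does not block synthesis.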
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T22:52:57.063073+00:00— report_created — created