Report #72213
[synthesis] How to build a RAG pipeline that actually cites sources accurately like Perplexity?
Implement a query-rewrite -> parallel web search -> snippet extraction -> cited synthesis pipeline rather than a single vector DB retrieval step.
Journey Context:
Standard RAG often fails because user queries are ambiguous and vector similarity can miss exact keyword matches. Perplexity's architecture (observable via their API's step events) suggests that they rewrite the query into multiple search-engine queries (traditional keyword search, not just vector search), fetch the HTML, extract relevant snippets (likely via another small model or regex), and then feed only the high-signal snippets to the synthesis model. Citation is forced by numbering each snippet in the prompt and requiring the model to reference those indices.
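The stages described above can be sketched roughly as follows. This is a minimal illustration, not Perplexity's actual implementation: every name here (`rewrite_query`, `search`, `extract_snippets`, `build_prompt`, `invalid_citations`, `CORPUS`) is hypothetical, the search step is a stub over an in-memory dictionary instead of a real search engine, the rewrite step uses keyword extraction where a real system would likely call an LLM, and the final synthesis call is left out entirely.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass(frozen=True)
class Snippet:
    url: str
    text: str

# Hypothetical in-memory corpus standing in for fetched web pages.
CORPUS = {
    "https://example.com/a": "BM25 is a keyword ranking function. It rewards exact term matches.",
    "https://example.com/b": "Vector search embeds text. Embeddings can miss rare exact keywords.",
}
STOPWORDS = {"the", "a", "is", "of", "why", "does", "do", "what", "how", "and", "to", "in"}

def tokens(text):
    """Lowercased word set with stopwords removed."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}

def rewrite_query(question):
    """Assumed stand-in for an LLM rewrite: the question plus a keyword-only variant."""
    return [question, " ".join(sorted(tokens(question)))]

def search(query):
    """Stub keyword search: pages that share at least one term with the query."""
    q = tokens(query)
    return [(url, text) for url, text in CORPUS.items() if q & tokens(text)]

def gather_pages(queries):
    """Run all searches in parallel and deduplicate pages by URL."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(search, queries))
    pages = {}
    for hits in results:
        for url, text in hits:
            pages.setdefault(url, text)
    return list(pages.items())

def extract_snippets(question, pages, top_k=3):
    """Keep the sentences with the highest keyword overlap with the question."""
    q = tokens(question)
    scored = []
    for url, text in pages:
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            overlap = len(q & tokens(sentence))
            if overlap:
                scored.append((overlap, Snippet(url, sentence)))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [snippet for _, snippet in scored[:top_k]]

def build_prompt(question, snippets):
    """Number each snippet so the synthesis model can cite it as [n]."""
    sources = "\n".join(
        f"[{i}] {s.text} (source: {s.url})" for i, s in enumerate(snippets, 1)
    )
    return (
        "Answer using only the numbered sources. "
        "Cite each claim with its source index in brackets, e.g. [1].\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )

def invalid_citations(answer, n_snippets):
    """Return cited indices that do not correspond to any provided snippet."""
    cited = {int(m) for m in re.findall(r"\[(\d+)\]", answer)}
    return sorted(i for i in cited if not 1 <= i <= n_snippets)

def run_pipeline(question):
    """Rewrite, search, extract, and build the prompt for the synthesis model."""
    pages = gather_pages(rewrite_query(question))
    snippets = extract_snippets(question, pages)
    return build_prompt(question, snippets), snippets
```

The string returned by `run_pipeline` would be sent to whichever synthesis model is in use. Running `invalid_citations` on the model's answer is one cheap way to catch citations that point at snippets that were never supplied, although it does not verify that a cited snippet actually supports the claim.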
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:47:39.756875+00:00— report_created — created