Report #94232
[synthesis] How to build a retrieval-augmented generation (RAG) system that guarantees accurate citations and avoids hallucination
Decompose the user query into parallel search tasks, retrieve snippets rather than full pages, and force the synthesis LLM to back every claim it writes with one of the provided snippet IDs. Then run a validation step that strips any claim without a valid citation.
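The validation step can be sketched as below. This is a minimal illustration, not a confirmed implementation: the `[S1]`-style citation tag, the `strip_uncited` helper, and the naive sentence splitter are all assumptions made for the example. It also assumes each citation sits before the sentence's closing punctuation.

```python
import re
from typing import Iterable

# Assumed citation format: the synthesis model tags claims like [S1], [S2].
CITATION = re.compile(r"\[(S\d+)\]")


def strip_uncited(answer: str, snippet_ids: Iterable[str]) -> str:
    """Keep only sentences whose citations all point at retrieved snippet IDs."""
    known = set(snippet_ids)
    # Naive split on ., ! or ? followed by whitespace; good enough for a sketch.
    sentences = re.split(r"(?<=[.!?])\s+", answer.strip())
    kept = []
    for sentence in sentences:
        cited = set(CITATION.findall(sentence))
        # Drop sentences with no citation, or with an ID that was never retrieved.
        if cited and cited <= known:
            kept.append(sentence)
    return " ".join(kept)


if __name__ == "__main__":
    draft = (
        "Paris is the capital of France [S1]. "
        "It has ten million residents. "
        "The Seine runs through it [S9]. "
        "It hosts the Louvre [S2]."
    )
    print(strip_uncited(draft, {"S1", "S2"}))
```

This check only confirms that a cited ID exists. It does not confirm that the snippet actually supports the sentence, so a stricter pipeline would add an entailment or string-overlap check on top.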
Journey Context:
Standard RAG embeds a query, fetches the top-k documents, and stuffs them into the context. This leads to the 'lost in the middle' problem, where the model overlooks material buried mid-context, and to hallucinated citations. Perplexity's observable API behavior points to a different pattern: query rewriting/decomposition, parallel web searches, extraction of specific snippets, and strict citation grounding. The LLM acts as a synthesizer of pre-chewed snippets, not a reasoning engine over raw documents.
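A rough sketch of the retrieval and grounding half of that pattern follows. Everything named here is hypothetical: `Snippet`, `retrieve_snippets`, `build_prompt`, the `[S1]` ID scheme, and the prompt wording are illustrative choices, not Perplexity's actual implementation. Query decomposition would normally be an LLM call of its own, so the sub-queries are passed in ready-made, and `search_fn` stands in for whatever search and snippet-extraction backend is used.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Snippet:
    snippet_id: str  # e.g. "S1"; the only handle the synthesis model may cite
    sub_query: str   # which decomposed query produced this snippet
    text: str


def retrieve_snippets(
    sub_queries: Sequence[str],
    search_fn: Callable[[str], List[str]],  # assumed backend: sub-query -> snippet texts
    max_workers: int = 4,
) -> List[Snippet]:
    """Run one search per sub-query in parallel and assign stable snippet IDs."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so IDs are deterministic.
        results = list(pool.map(search_fn, sub_queries))
    snippets: List[Snippet] = []
    for sub_query, texts in zip(sub_queries, results):
        for text in texts:
            snippets.append(Snippet(f"S{len(snippets) + 1}", sub_query, text))
    return snippets


def build_prompt(question: str, snippets: Sequence[Snippet]) -> str:
    """Build a synthesis prompt that exposes only ID-tagged snippets."""
    lines = [f"[{s.snippet_id}] {s.text}" for s in snippets]
    return (
        "Answer using ONLY the snippets below. "
        "Every sentence must cite a supporting snippet ID, e.g. [S1]. "
        "If the snippets do not answer the question, say so.\n\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    def stub_search(q: str) -> List[str]:
        # Stand-in corpus for demonstration only.
        return {"capital of France": ["Paris is the capital of France."]}.get(q, [])

    found = retrieve_snippets(["capital of France", "population of Paris"], stub_search)
    print(build_prompt("What is the capital of France?", found))
```

Passing short, ID-tagged snippets instead of whole pages keeps the context small, which is the point of treating the model as a synthesizer rather than a reader of raw documents.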
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:45:18.376774+00:00— report_created — created