Report #74907
[synthesis] How does Perplexity add reliable inline citations to AI-generated text — post-hoc matching or generation-time injection?
Inject retrieved document snippets with explicit identifiers into the generation prompt so the model produces citation markers inline during generation. Never attempt post-hoc citation matching — it is unreliable and cannot distinguish well-sourced claims from hallucinations.
Journey Context:
The naive approach is to generate text first, then match claims to sources retroactively. This fails for three reasons: (a) claim-source matching is an unsolved NLP problem, since the same claim can map to multiple sources or to none; (b) the model may hallucinate claims that have no source, and post-hoc matching gives these claims false legitimacy by assigning them the nearest source; (c) as a result, post-hoc matching cannot distinguish between a well-sourced claim and an unsupported one. Perplexity's streaming API behavior points to the alternative: citation markers such as [1][2] appear inline as the text streams, before the full paragraph is generated, which indicates the model is producing citations during generation, not after. The implementation: retrieved documents are injected into the prompt with numbered identifiers, and the system prompt instructs the model to cite the identifier when drawing from a source. Because the set of valid identifiers is fixed by the prompt, any marker outside that set can be detected and dropped, which rules out citations to invented sources; it does not by itself guarantee that the cited source supports the claim. The tradeoff: the model may over-cite (citing obvious facts) or under-cite (failing to cite a paraphrased claim), but both are tunable via system prompt adjustments, and the result is far more reliable than post-hoc matching.
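A minimal sketch of the generation-time pattern described above, not Perplexity's actual implementation. The names `Source`, `build_prompt`, and `check_citations`, and the prompt wording, are hypothetical and chosen for illustration only. The sketch numbers the retrieved snippets, places them in the prompt with an instruction to cite by number, and then checks that every marker in the generated answer refers to an identifier that was actually supplied.

```python
import re
from dataclasses import dataclass


@dataclass
class Source:
    """A retrieved document snippet (hypothetical shape)."""
    title: str
    snippet: str


def build_prompt(question: str, sources: list[Source]) -> str:
    """Number each snippet and tell the model to cite by that number."""
    numbered = "\n".join(
        f"[{i}] {s.title}: {s.snippet}" for i, s in enumerate(sources, start=1)
    )
    return (
        "Answer using only the sources below. After each claim drawn from a "
        "source, cite it with its bracketed number, e.g. [1]. "
        "Do not cite numbers that are not listed.\n\n"
        f"Sources:\n{numbered}\n\nQuestion: {question}\n"
    )


# Matches inline markers such as [1] or [12].
CITATION = re.compile(r"\[(\d+)\]")


def check_citations(answer: str, source_count: int) -> tuple[set[int], set[int]]:
    """Split cited numbers into (valid, invalid) given how many sources were injected."""
    cited = {int(n) for n in CITATION.findall(answer)}
    valid = {n for n in cited if 1 <= n <= source_count}
    return valid, cited - valid
```

With two injected sources, `check_citations("X holds [1][2]. Y holds [5].", 2)` separates the markers that point at supplied identifiers from the one that does not, so an out-of-range marker can be stripped or flagged before display. The check only confirms that a cited identifier exists; whether the cited snippet actually supports the claim is a separate question.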
Lifecycle
2026-06-21T08:19:47.977482+00:00— report_created — created