Report #6222
[research] Generating plausible but non-existent academic citations or URLs
Never generate a citation URL, DOI, or paper title from parametric memory. Output a citation only if it is extracted directly from the provided context, and append the verbatim source snippet to prove grounding.
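One way to apply this rule is to build the citation list from the provided context itself rather than from model output. The sketch below is illustrative only: the function name `extract_grounded_citations`, the `window` parameter, and both regular expressions are assumptions made for this example, and the patterns are deliberately simple, so they may need tuning for real documents.

```python
import re

# Hypothetical, simplified patterns (assumptions for this sketch).
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")
URL_RE = re.compile(r"https?://[^\s\"<>]+")


def extract_grounded_citations(context: str, window: int = 60) -> list[dict]:
    """Return identifiers that occur verbatim in `context`, each paired
    with the surrounding text as a grounding snippet."""
    results = []
    seen = set()
    for kind, pattern in (("doi", DOI_RE), ("url", URL_RE)):
        for match in pattern.finditer(context):
            # Drop trailing sentence punctuation picked up by the pattern.
            ident = match.group(0).rstrip(".,;)")
            if ident in seen:
                continue
            seen.add(ident)
            # Slice the snippet straight from the context so it is verbatim.
            start = max(0, match.start() - window)
            end = min(len(context), match.end() + window)
            results.append(
                {"kind": kind, "id": ident, "snippet": context[start:end]}
            )
    return results


if __name__ == "__main__":
    ctx = "See Smith et al., doi:10.1234/abcd.5678, and https://example.org/paper."
    for item in extract_grounded_citations(ctx):
        print(item["kind"], item["id"], "|", item["snippet"])
```

Because every identifier and snippet is sliced from the context string, nothing in the returned list can originate from parametric memory. A context with no identifiers yields an empty list, which is the intended outcome: no citation is better than an invented one.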
Journey Context:
LLMs are generative models trained to predict likely token sequences. A realistic-looking DOI or URL can be statistically likely yet factually ungrounded. Relying on the model to 'remember' a URL frequently yields a dead link (such as a 404) or fabricated metadata. Restricting citations to retrieved text, each backed by a verbatim quote, is the most reliable mitigation against citation confabulation.
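A post-generation check can enforce the same grounding on model output. The following is a minimal sketch under stated assumptions: each candidate citation is a dict with hypothetical `id` and `quote` keys, and the function name `filter_grounded` is invented for this example. A candidate is kept only when both its identifier and its supporting quote occur in the retrieved context after whitespace normalization.

```python
import re


def _normalize(text: str) -> str:
    """Collapse runs of whitespace so line wrapping does not break matching."""
    return re.sub(r"\s+", " ", text).strip()


def filter_grounded(citations: list[dict], context: str) -> tuple[list[dict], list[dict]]:
    """Split candidate citations into (kept, rejected).

    Assumed input shape (hypothetical): {"id": "<doi or url>", "quote": "<text>"}.
    A citation is kept only if both fields are non-empty and appear in `context`.
    """
    ctx = _normalize(context)
    kept, rejected = [], []
    for citation in citations:
        ident = _normalize(citation.get("id", ""))
        quote = _normalize(citation.get("quote", ""))
        grounded = bool(ident) and bool(quote) and ident in ctx and quote in ctx
        (kept if grounded else rejected).append(citation)
    return kept, rejected


if __name__ == "__main__":
    context = "Results are reported in\n  doi:10.1234/abcd.5678 (Smith 2020)."
    candidates = [
        {"id": "10.1234/abcd.5678",
         "quote": "Results are reported in doi:10.1234/abcd.5678"},
        {"id": "10.9999/fake.0001", "quote": "as shown by Jones"},
    ]
    kept, rejected = filter_grounded(candidates, context)
    print("kept:", [c["id"] for c in kept])
    print("rejected:", [c["id"] for c in rejected])
```

Substring matching is intentionally strict: a paraphrased quote or a slightly altered DOI is rejected rather than repaired. This check confirms only that the citation appears in the supplied text, not that the underlying link resolves or that the source is accurate.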
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T23:36:32.627577+00:00— report_created — created