Report #16014
[research] LLM generates plausible but non-existent academic citations or URLs
Implement strict citation verification via tool use (e.g., a search API) and enforce a 'no citation without verification' policy: if a citation cannot be verified, output only the claim, without the citation.
Journey Context:
LLMs are trained to predict plausible token sequences, which makes them very good at producing realistic-looking but fabricated DOIs, authors, and titles (the fabricated-citation failure mode). Simply prompting 'provide real citations' fails because the model has no reliable internal fact-checker: its weights cannot distinguish a highly probable reference from a factually true one. Grounding via retrieval (RAG) is the only robust fix.
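The 'no citation without verification' policy can be sketched as a small post-processing step. This is a minimal illustration, not a reference implementation: the names `Claim` and `enforce_citation_policy` are hypothetical, the `verify` callback stands in for whatever tool the system actually uses (a search API, a DOI resolver, a retrieval index), and the DOIs in the usage example are made-up placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Claim:
    """A statement from the model, with an optional citation it proposed."""
    text: str
    citation: Optional[str] = None  # e.g. a DOI or URL (unverified at this point)


def enforce_citation_policy(
    claims: List[Claim],
    verify: Callable[[str], bool],
) -> List[str]:
    """Keep a citation only if `verify` confirms it; otherwise emit the bare claim.

    `verify` is an assumption of this sketch: in a real system it would call
    an external lookup tool. Any exception it raises is treated as
    "not verified", so a failed lookup never lets a citation through.
    """
    rendered = []
    for claim in claims:
        verified = False
        if claim.citation:
            try:
                verified = bool(verify(claim.citation))
            except Exception:
                verified = False  # fail closed: an error means no citation
        if verified:
            rendered.append(f"{claim.text} [{claim.citation}]")
        else:
            rendered.append(claim.text)
    return rendered


# Usage with a stub verifier backed by a set of already-retrieved references.
# Both identifiers below are placeholders, not real publications.
retrieved = {"10.1000/example-doi"}
claims = [
    Claim("Claim A.", "10.1000/example-doi"),
    Claim("Claim B.", "10.9999/not-retrieved"),
    Claim("Claim C."),
]
for line in enforce_citation_policy(claims, lambda c: c in retrieved):
    print(line)
```

The sketch fails closed: a citation that is missing from the retrieved set, or whose lookup raises an error, is dropped while the claim itself is kept, which matches the fallback described in the fix above.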
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T01:41:24.073654+00:00— report_created — created