Report #83960
[research] Hallucinated URLs and Fabricated DOIs in Citations
Never generate a URL, DOI, or citation from parametric memory. Enforce a strict policy: output a citation only if it was explicitly retrieved via a search tool and verified to exist. If no source is found, output 'No verified source found'.
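A minimal sketch of such a citation gate, assuming the retrieval layer hands back records that it has already checked. The names RetrievedSource, render_citations, and the verified flag are hypothetical and used only for illustration; they do not come from any particular library.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable

NO_SOURCE = "No verified source found"


@dataclass(frozen=True)
class RetrievedSource:
    """Hypothetical record produced by a search tool (an assumption)."""
    url: str
    title: str
    verified: bool  # assumed to be set by the retrieval layer after it confirmed the source exists


def render_citations(
    candidate_urls: Iterable[str],
    retrieved: Iterable[RetrievedSource],
) -> list[str]:
    """Emit only citations whose URL was retrieved and verified.

    Anything the model proposes that is not in the verified set is dropped,
    so a URL recalled from parametric memory never reaches the output.
    """
    allowed = {s.url: s for s in retrieved if s.verified}
    lines: list[str] = []
    for url in candidate_urls:
        src = allowed.get(url)
        if src is None:
            continue  # not retrieved, or retrieved but unverified: do not cite
        line = f"{src.title} <{src.url}>"
        if line not in lines:  # avoid duplicate citations
            lines.append(line)
    return lines or [NO_SOURCE]
```

The gate is an allowlist rather than a checker: it never tries to judge whether a proposed URL is real, it only asks whether the tool returned it.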
Journey Context:
LLMs are trained to predict plausible token sequences, making them excellent at generating fake URLs that follow correct domain patterns (e.g., arxiv.org/abs/2401.xxxxx). Post-hoc checking by the user is error-prone and costly. The only reliable mitigation is to ban generated citations entirely and force tool-use for retrieval, treating the LLM as a reasoner rather than a database.
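To illustrate why a format check alone does not help, here is a small sketch. The regex reflects the new-style arXiv identifier shape (four digits, a dot, four or five digits, optional version suffix); the function names and the example identifier are assumptions chosen arbitrarily for illustration, and the identifier is not claimed to refer to any real paper.

```python
import re

# Shape of a new-style arXiv abstract URL. Matching this pattern says nothing
# about whether the paper actually exists.
ARXIV_URL = re.compile(r"https?://arxiv\.org/abs/\d{4}\.\d{4,5}(v\d+)?")


def looks_like_arxiv_url(url: str) -> bool:
    """Purely syntactic check: passes for any well-formed identifier."""
    return ARXIV_URL.fullmatch(url) is not None


def is_citable(url: str, retrieved_urls: set[str]) -> bool:
    """Hypothetical policy check: well-formed AND returned by the search tool."""
    return looks_like_arxiv_url(url) and url in retrieved_urls


# An arbitrary, made-up identifier (assumption for illustration only).
made_up = "https://arxiv.org/abs/2401.99999"
print(looks_like_arxiv_url(made_up))      # the pattern check accepts it
print(is_citable(made_up, set()))         # the retrieval-backed check rejects it
```

The pattern check is the kind of test a fabricated URL passes by construction, which is why membership in the retrieved set, not plausibility, has to be the deciding condition.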
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:30:54.943188+00:00— report_created — created