Report #96372
[research] Generating plausible but non-existent academic citations, DOIs, or URLs
Mandate a strict extraction-only policy for citations: never generate a DOI or URL from parametric memory. If a reference must be produced, pass it through a verification step (e.g., a search API) before emitting it, or omit the citation entirely.
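A minimal sketch of an extraction-only check, assuming the agent has the raw text of its retrieved sources on hand. The function names (`extract_identifiers`, `filter_citations`) and the simplified DOI/URL patterns are illustrative assumptions, not part of any existing library. A drafted identifier is kept only if it appears verbatim in a source.

```python
import re

# Simplified patterns, assumed for illustration; real DOIs and URLs can be messier.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")
URL_RE = re.compile(r"https?://[^\s\"<>]+")


def _strip_trailing_punct(token: str) -> str:
    # Sentence punctuation often clings to the end of an identifier in prose.
    return token.rstrip(".,;:)]}")


def extract_identifiers(text: str) -> set[str]:
    """Collect every DOI-like and URL-like string found in the text."""
    found: set[str] = set()
    for pattern in (DOI_RE, URL_RE):
        for match in pattern.finditer(text):
            found.add(_strip_trailing_punct(match.group(0)))
    return found


def filter_citations(draft: str, sources: list[str]) -> tuple[set[str], set[str]]:
    """Split identifiers in the draft into (kept, dropped).

    An identifier is kept only if it occurs verbatim in at least one source;
    anything else is assumed to come from parametric memory and is dropped.
    """
    allowed: set[str] = set()
    for source in sources:
        allowed |= extract_identifiers(source)
    cited = extract_identifiers(draft)
    return cited & allowed, cited - allowed
```

In use, the `dropped` set would be removed from the draft (or the surrounding claim flagged) before the answer is returned. Exact string matching is a deliberately conservative choice here: a near-miss DOI is treated the same as an invented one.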
Journey Context:
LLMs are trained to predict plausible token sequences, which makes them good at producing citations that are syntactically correct but factually void (e.g., fake arXiv IDs). Agents often trust these because they look valid. The tradeoff is between providing a helpful-looking reference and strict factuality. Strict factuality requires treating every parametric URL or citation as hallucinated until it is externally verified.
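The gap between "looks valid" and "is verified" can be sketched as follows. This is an illustration under stated assumptions: `Citation`, `looks_valid`, and `gate` are hypothetical names, the arXiv pattern is simplified, and the `verifier` callable stands in for whatever external lookup (such as a search API) is available. The IDs in the usage note are placeholders, not real papers.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable

# Simplified shape of a modern arXiv ID (assumption): four digits, a dot,
# four or five digits, and an optional version suffix.
ARXIV_ID_RE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")


@dataclass(frozen=True)
class Citation:
    identifier: str
    verified: bool = False  # anything from parametric memory starts untrusted


def looks_valid(identifier: str) -> bool:
    """Syntactic check only; says nothing about whether the paper exists."""
    return ARXIV_ID_RE.match(identifier) is not None


def gate(identifiers: Iterable[str], verifier: Callable[[str], bool]) -> list[Citation]:
    """Keep only identifiers that the external verifier confirms.

    Unconfirmed identifiers are omitted rather than emitted with a caveat.
    """
    confirmed: list[Citation] = []
    for identifier in identifiers:
        if verifier(identifier):
            confirmed.append(Citation(identifier, verified=True))
    return confirmed
```

For example, with a stub index `known = {"0000.00001"}` used as the verifier (`known.__contains__`), the placeholder `"9999.99999"` passes `looks_valid` yet is dropped by `gate`, which is the failure mode described above: syntactic plausibility is not evidence of existence.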
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:20:40.624690+00:00— report_created — created