Report #85567
[research] LLM generates plausible but non-existent DOIs, arXiv IDs, or URLs when asked for citations
Never trust model-generated citations without programmatic verification. Add a retrieval (RAG) step in which the agent queries a trusted search API (e.g., Semantic Scholar, PubMed) and copies the exact identifier from the returned payload instead of generating it.
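A minimal sketch of that retrieval step, assuming the Semantic Scholar Graph API paper-search endpoint and its `externalIds` field (keys such as `DOI` and `ArXiv`). The function names `search_papers` and `extract_identifiers` are hypothetical, and the endpoint, field names, and rate limits should be checked against the current API documentation before relying on this.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Semantic Scholar Graph API paper-search endpoint.
SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def search_papers(query, limit=5, timeout=10):
    """Query the search API and return the decoded JSON payload."""
    params = urllib.parse.urlencode(
        {"query": query, "limit": limit, "fields": "title,externalIds"}
    )
    with urllib.request.urlopen(f"{SEARCH_URL}?{params}", timeout=timeout) as resp:
        return json.load(resp)


def extract_identifiers(payload):
    """Copy identifiers verbatim from the payload; never synthesize them."""
    results = []
    for paper in payload.get("data") or []:
        # externalIds may be missing or null for some records.
        ext = paper.get("externalIds") or {}
        results.append(
            {
                "title": paper.get("title"),
                "doi": ext.get("DOI"),      # None when the source has no DOI
                "arxiv": ext.get("ArXiv"),  # None when the source has no arXiv ID
            }
        )
    return results
```

The agent would call `extract_identifiers(search_papers(title))` and cite only the values returned. A `None` field means the source did not supply that identifier, and the agent should report it as missing rather than fill it in.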
Journey Context:
LLMs are trained to predict plausible token sequences. Academic identifiers follow predictable patterns (e.g., '10.1234/...', 'arXiv:2310.xxxxx'), so a model can easily emit a well-formed identifier that points to nothing. Evaluations like HaluEval show that LLMs hallucinate citations at high rates when not grounded. The fix shifts the burden from generation to retrieval, trading a slight latency increase for near-perfect citation accuracy, because every identifier is copied from a trusted source rather than predicted.
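To illustrate why a well-formed identifier proves nothing, the sketch below separates a shape check from an existence check. The regex is a loose illustrative pattern, not an official DOI grammar, and `classify_doi` and `normalize_doi` are hypothetical helpers. A DOI counts as verified only if it also appears in the set returned by a trusted search API.

```python
import re

# Loose DOI shape check (illustrative pattern, not an official grammar).
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")


def normalize_doi(doi):
    """Lowercase and strip common URL/prefix forms so DOIs compare equal."""
    doi = doi.strip().lower()  # DOIs are case-insensitive
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def classify_doi(candidate, retrieved_dois):
    """Return 'verified', 'unverified', or 'malformed' for a candidate DOI."""
    doi = normalize_doi(candidate)
    if not DOI_SHAPE.match(doi):
        return "malformed"
    # Passing the shape check is not enough: the DOI must also have been
    # returned by the trusted source.
    known = {normalize_doi(d) for d in retrieved_dois}
    return "verified" if doi in known else "unverified"
```

A hallucinated DOI typically lands in the "unverified" bucket: it passes the regex but matches nothing retrieved, which is exactly the case a format-only validator would miss.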
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T02:12:53.694444+00:00— report_created — created