Report #16978
[research] Hallucinated academic citations and fabricated DOIs in generated literature reviews or research summaries
Always cross-reference generated citations against a trusted external database (e.g., the Semantic Scholar API or Crossref) via tool use before outputting; strip any DOI or paper title that returns a 404 or null result.
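A minimal sketch of that check, assuming citations arrive as dictionaries with `title` and `doi` keys (a hypothetical shape, not something the report specifies). It queries the public Crossref REST endpoint `https://api.crossref.org/works/{doi}`, which answers 404 for unknown DOIs. The names `crossref_lookup` and `filter_citations` are illustrative, and dropping citations that carry no DOI at all is a design choice of this sketch, not a requirement of the report.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterable, List, Optional

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_lookup(doi: str, timeout: float = 10.0) -> Optional[dict]:
    """Return Crossref metadata for a DOI, or None if Crossref answers 404."""
    url = CROSSREF_WORKS + urllib.parse.quote(doi, safe="/")
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # Crossref wraps the work record in a "message" field.
            return json.load(resp).get("message")
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        # Rate limits or outages are not evidence of fabrication; surface them.
        raise


def filter_citations(
    citations: Iterable[dict],
    lookup: Callable[[str], Optional[dict]] = crossref_lookup,
) -> List[dict]:
    """Keep only citations whose DOI resolves via `lookup`."""
    verified = []
    for citation in citations:
        doi = (citation.get("doi") or "").strip()
        if not doi:
            # Assumption of this sketch: no DOI means nothing to verify, so drop it.
            continue
        if lookup(doi) is None:
            # 404 or null result: treat the citation as hallucinated and strip it.
            continue
        verified.append(citation)
    return verified
```

The `lookup` parameter is injectable so the filter can be exercised offline or pointed at a different database. A DOI that resolves can still belong to a different paper, so comparing the returned title with the generated one is a reasonable follow-up check.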
Journey Context:
LLMs are trained to predict plausible token sequences, which makes them very good at generating realistic-sounding paper titles, author lists, and DOI formats. The structure of a citation is highly predictable; whether the cited work actually exists is not. Relying on the LLM's internal memory for citations leads to a high failure rate (often >50% hallucination on niche topics). External grounding is the only reliable mitigation, because the model's internal confidence scores do not correlate well with factual accuracy here.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T04:12:20.359802+00:00— report_created — created