Report #94000
[research] Hallucinated academic citations and fabricated DOIs in literature reviews
Never generate a DOI, author list, or exact title from parametric memory. When citing, extract the metadata strictly from the provided context, or refuse. Otherwise, use a tool call to search a verified database (e.g., the Semantic Scholar API) and quote the returned metadata exactly.
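A minimal sketch of the tool-use path, assuming the public Semantic Scholar Graph API paper-search endpoint and its `title`, `authors`, `year`, and `externalIds` fields. The names `lookup_citation` and `http_get_json` are hypothetical helpers introduced here for illustration, not part of any library. Rate limits, API keys, and checking that the top hit is actually the intended paper are left out.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Optional

# Assumed endpoint: Semantic Scholar Graph API paper search.
S2_SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body (standard library only)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def lookup_citation(title_query: str,
                    fetch: Callable[[str], dict] = http_get_json) -> Optional[dict]:
    """Return metadata for the top search hit as the API returned it,
    or None when there is no hit (the caller should then refuse to cite)."""
    params = urllib.parse.urlencode({
        "query": title_query,
        "fields": "title,authors,year,externalIds",
        "limit": 1,
    })
    payload = fetch(f"{S2_SEARCH_URL}?{params}")
    hits = payload.get("data") or []
    if not hits:
        return None
    paper = hits[0]
    return {
        "title": paper.get("title"),
        "authors": [a.get("name") for a in paper.get("authors") or []],
        "year": paper.get("year"),
        # The DOI may be missing from the record; never synthesize one.
        "doi": (paper.get("externalIds") or {}).get("DOI"),
    }
```

The `fetch` parameter exists so the lookup can be exercised offline with a stub. A `None` result, or a record whose `doi` is `None`, should be surfaced to the user as "not found" rather than filled in from memory.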
Journey Context:
LLMs are prone to hallucination when generating citations, often blending real authors with plausible-sounding but fake titles. Hallucination evaluations such as HaluEval reportedly show LLMs fabricating citations roughly 50-70% of the time when prompted for references without context. Relying on parametric memory for precise citations is fundamentally unreliable because the model optimizes for fluency over factuality, which makes grounding through tool use or provided context the only dependable mitigation.
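Because a fluent draft can still carry invented identifiers, one possible guard is to reject any DOI in the model's output that does not also appear in the provided context. The sketch below assumes a loose DOI pattern (a `10.` prefix, a 4-9 digit registrant code, a slash, then a suffix) and case-insensitive comparison. `extract_dois` and `ungrounded_dois` are hypothetical names, and the pattern will miss some unusual DOIs.

```python
import re

# Loose DOI pattern (an assumption, not a full validator):
# "10.", a 4-9 digit registrant code, "/", then a non-space suffix.
DOI_RE = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def extract_dois(text: str) -> set:
    """Collect DOI-like strings, lowercased, with trailing punctuation trimmed."""
    return {m.rstrip(".,;:)]}").lower() for m in DOI_RE.findall(text)}


def ungrounded_dois(draft: str, context: str) -> set:
    """DOIs that appear in the draft but nowhere in the provided context."""
    return extract_dois(draft) - extract_dois(context)


context = "See Smith (doi:10.1000/abc123)."
draft = "Smith [10.1000/ABC123] and Jones [10.9999/made-up]."
flagged = ungrounded_dois(draft, context)  # only the DOI missing from context
```

A non-empty result means the draft cites something the context cannot back up, so the citation should be dropped or the response refused. The same set-difference idea could be extended to titles and author names, though those need fuzzier matching.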
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:21:52.127567+00:00— report_created — created