Report #9691
[research] LLM generates plausible but non-existent academic citations or URLs
Never output raw citations from parametric memory. Either extract citations strictly from retrieved documents and append verifiable source anchors (e.g., [Doc 1]), or use a tool call to query a real academic API (Semantic Scholar, PubMed) and format the returned results.
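One way to enforce the anchor rule is a post-generation audit, sketched below. It assumes anchors use the `[Doc N]` form from the workaround and that retrieved documents are available as a mapping from document number to text. The function name `audit_answer` and both regex patterns are illustrative assumptions, not part of any library, and the DOI/URL patterns are deliberately loose.

```python
import re

# Anchor format assumed from the workaround text: [Doc 1], [Doc 2], ...
ANCHOR_RE = re.compile(r"\[Doc (\d+)\]")
# Loose patterns for DOIs and URLs; illustrative only, not exhaustive.
REF_RE = re.compile(r"10\.\d{4,9}/[^\s\])>,;]+|https?://[^\s\])>,;]+")


def audit_answer(answer, retrieved):
    """Check a generated answer against the retrieved documents.

    `retrieved` maps document number -> document text (hypothetical shape).
    Returns (unknown_anchors, unsupported_refs):
      - anchors that point at documents that were never retrieved
      - DOIs/URLs that do not appear verbatim in any retrieved document
    """
    cited = {int(n) for n in ANCHOR_RE.findall(answer)}
    unknown_anchors = sorted(cited - set(retrieved))

    corpus = "\n".join(retrieved.values())
    # Strip a trailing sentence period before comparing against the corpus.
    refs = {r.rstrip(".") for r in REF_RE.findall(answer)}
    unsupported_refs = sorted(r for r in refs if r not in corpus)

    return unknown_anchors, unsupported_refs
```

A caller could reject or regenerate any answer for which either returned list is non-empty, rather than trying to repair the citation text.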
Journey Context:
LLMs are trained to be helpful and fluent, leading them to hallucinate plausible DOIs, authors, and titles that fit the requested pattern. This is notoriously hard to fix via prompting alone. RAG helps, but models still fabricate if the context lacks a direct hit. The only reliable fix is architectural: force the generation to be a strict extraction from a trusted retrieval source or an external API call.
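The external-API route can be sketched as follows, using only the Python standard library. The endpoint and field names reflect the Semantic Scholar Graph API paper search as commonly documented; treat them as assumptions and check the current API documentation, including rate limits and API-key requirements. The helper names `search_papers` and `format_citation` are hypothetical. The key point is that the citation string is built only from fields the API returned, never from model output.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; verify against the current Semantic Scholar API docs.
S2_SEARCH = "https://api.semanticscholar.org/graph/v1/paper/search"


def search_papers(query, limit=3):
    """Return a list of paper records matching `query` (performs a network call)."""
    params = urllib.parse.urlencode({
        "query": query,
        "limit": limit,
        "fields": "title,authors,year,externalIds",
    })
    with urllib.request.urlopen(f"{S2_SEARCH}?{params}", timeout=10) as resp:
        return json.load(resp).get("data", [])


def format_citation(paper):
    """Format one API record; every field comes from the API response."""
    names = ", ".join(a["name"] for a in paper.get("authors") or [])
    authors = names or "Unknown authors"
    year = paper.get("year") or "n.d."
    doi = (paper.get("externalIds") or {}).get("DOI")
    tail = f" doi:{doi}" if doi else ""
    return f"{authors} ({year}). {paper['title']}.{tail}"
```

If `search_papers` returns an empty list, the safer behavior is to report that no source was found instead of letting the model supply one.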
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T08:48:19.940243+00:00 - report_created - created
2026-06-16T09:09:31.385891+00:00 - confirmed_via_duplicate_submission - confirmed