Report #87921
[research] Generating plausible but non-existent academic citations or DOIs
Implement strict citation verification; force the model to extract exact quotes from provided context rather than generating citations from parametric memory. If no context is provided, output a structured refusal to cite rather than guessing.
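A minimal sketch of the verification step described above, assuming the model returns each citation as a source identifier plus a verbatim quote. The names `Citation`, `verify_citations`, `cite_or_refuse`, and the refusal fields are hypothetical and used only for illustration; they are not part of any existing library. A citation is accepted only if its quote appears in the provided context after whitespace and case normalization; otherwise it is dropped, and if nothing survives the output is a structured refusal.

```python
import re
from dataclasses import dataclass


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so line wrapping does not defeat matching."""
    return re.sub(r"\s+", " ", text).strip().lower()


@dataclass
class Citation:
    source_id: str  # hypothetical key identifying one provided context document
    quote: str      # text the model claims to have copied verbatim from it


def verify_citations(citations, context):
    """Split citations into (verified, rejected) by exact-quote lookup in context."""
    verified, rejected = [], []
    for c in citations:
        doc = context.get(c.source_id)
        quote = _normalize(c.quote)
        # Accept only non-empty quotes found inside the named context document.
        if doc is not None and quote and quote in _normalize(doc):
            verified.append(c)
        else:
            rejected.append(c)
    return verified, rejected


def cite_or_refuse(citations, context):
    """Return verified citations, or a structured refusal instead of a guess."""
    if not context:
        return {"status": "refused", "reason": "no_context_provided", "citations": []}
    verified, rejected = verify_citations(citations, context)
    if not verified:
        return {"status": "refused", "reason": "no_verifiable_citation", "citations": []}
    return {
        "status": "ok",
        "citations": [{"source_id": c.source_id, "quote": c.quote} for c in verified],
        "rejected_count": len(rejected),
    }


if __name__ == "__main__":
    # Illustrative context and model output; both are made up for this example.
    context = {"doc1": "Transformers rely on\nself-attention to model long-range dependencies."}
    model_output = [
        Citation("doc1", "self-attention to model long-range dependencies"),
        Citation("doc2", "a quote from a document that was never provided"),
    ]
    print(cite_or_refuse(model_output, context))
    print(cite_or_refuse(model_output, {}))
```

Substring matching is deliberately strict: a paraphrased quote is rejected even if the underlying source is real, which is the recall cost this report accepts in exchange for citation precision.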
Journey Context:
LLMs are trained to be helpful and will confidently generate realistic-sounding paper titles, authors, and DOIs that are completely fabricated. This is a known failure mode in RAG and academic search. The tradeoff is that forcing extraction reduces recall (you might miss a real paper not in the context), but precision for citations must be 1.0. Relying on the model's internal weights for citation metadata guarantees hallucination.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T06:09:41.692067+00:00— report_created — created