Report #51157
[research] Generating plausible but non-existent academic citations or URLs
Never generate raw DOIs, URLs, or citations from parametric memory. If citing, extract strictly from a provided retrieval context. If no context exists, explicitly state the inability to provide verifiable citations.
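The rule above can be sketched as a small formatting step that only ever copies citation fields from retrieved records and otherwise returns an explicit refusal. This is a minimal illustration, not a prescribed implementation: `RetrievedSource`, `NO_CITATION_MESSAGE`, and `format_citations` are hypothetical names, and the record shape (a title plus a URL) is an assumption about what the retrieval layer returns.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedSource:
    # Hypothetical record for one document returned by a retrieval step.
    title: str
    url: str


# Assumed wording; any explicit statement of inability would serve.
NO_CITATION_MESSAGE = (
    "I cannot provide verifiable citations without retrieved sources."
)


def format_citations(sources: list[RetrievedSource]) -> str:
    """Build a citation list strictly from retrieved records.

    Nothing is generated here: titles and URLs are copied verbatim.
    With no retrieval context, the function says so instead of citing.
    """
    if not sources:
        return NO_CITATION_MESSAGE
    return "\n".join(
        f"[{i}] {s.title} <{s.url}>" for i, s in enumerate(sources, start=1)
    )
```

The design choice is that the citation text never passes through the model at all, so there is no opportunity for it to be rewritten into something plausible but wrong.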
Journey Context:
LLMs are trained to predict plausible token sequences, making them excellent at generating realistic-looking but entirely fake references (hallucinated titles, authors, and DOIs). This is one of the most dangerous failure modes because the output looks authoritative. Relying on the model's internal weights for factual citation retrieval is fundamentally broken; retrieval-augmented generation (RAG) with strict grounding constraints is the only reliable mitigation.
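One way to enforce a strict grounding constraint is a post-generation check that flags any DOI or URL in the answer that does not also appear in the retrieval context. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the regular expressions are rough heuristics (they do not cover every valid DOI or URL form), so it should be read as a starting point rather than a complete validator.

```python
from __future__ import annotations

import re

# Heuristic patterns (assumptions, not full DOI/URL grammars).
DOI_RE = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')
URL_RE = re.compile(r'https?://[^\s"<>)\]]+')


def _normalize(ref: str) -> str:
    # Drop trailing sentence punctuation and compare case-insensitively.
    return ref.rstrip(".,;:)]").lower()


def extract_references(text: str) -> set[str]:
    """Collect DOI-like and URL-like strings found in the text."""
    found = DOI_RE.findall(text) + URL_RE.findall(text)
    return {_normalize(r) for r in found}


def ungrounded_references(answer: str, context: str) -> set[str]:
    """Return references in the answer that are absent from the context."""
    return extract_references(answer) - extract_references(context)


if __name__ == "__main__":
    context = "Source: https://example.org/paper (doi:10.1234/abcd.5678)."
    answer = "See https://example.org/paper and doi:10.9999/fake.0001."
    # The second DOI never appeared in the context, so it is flagged.
    print(ungrounded_references(answer, context))
```

A non-empty result means the answer contains at least one reference that cannot be traced to the provided context and should be rejected or regenerated. Passing the check shows only that a reference was copied from the context, not that the reference itself is correct.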
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:21:12.478066+00:00 — report_created — created