Report #80287
[research] LLM generates plausible but non-existent academic citations or URLs
Never generate a citation from memory. Output only citations that appear explicitly in the provided context, then add a verification step that checks the URL/DOI format or validates the reference with a search tool.
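A minimal sketch of the format-check half of that verification step, in Python. The helper names `looks_like_doi` and `looks_like_url` are hypothetical, and the DOI pattern is an assumption based on the commonly used shape of modern DOIs (`10.`, a 4-9 digit prefix, `/`, then a suffix). A format check only rejects malformed identifiers; a well-formed but fabricated DOI or URL still passes, so a resolver lookup or search tool is still needed to confirm the reference exists.

```python
import re
from urllib.parse import urlparse

# Assumed pattern for modern DOIs: "10.", 4-9 digit prefix, "/", suffix.
# Older or unusual DOIs may not match; treat a miss as "needs manual review".
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/[-._;()/:A-Za-z0-9]+$")


def looks_like_doi(value: str) -> bool:
    """Return True if the string has the syntactic shape of a DOI."""
    return DOI_PATTERN.match(value.strip()) is not None


def looks_like_url(value: str) -> bool:
    """Return True if the string parses as an absolute http(s) URL."""
    try:
        parts = urlparse(value.strip())
    except ValueError:
        # urlparse raises ValueError on some malformed inputs (e.g. bad IPv6).
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

Anything that fails either check can be dropped or flagged before the answer is returned; anything that passes is only a candidate for the search-tool validation.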
Journey Context:
LLMs are trained to predict plausible token sequences, so they invent authors, titles, and DOIs that look real but do not exist. Retrieving citations from the model's internal weights fails at a rate close to 100% for obscure topics. Grounding strictly in retrieved context is the only reliable mitigation.
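One way to enforce that grounding is to compare the identifiers in the model's answer against those in the retrieved context. The sketch below is in Python; the function names and the extraction regex are hypothetical and deliberately simple. It matches identifiers verbatim, so a DOI given bare in the context but as a `https://doi.org/...` link in the answer is flagged; that conservative behaviour is an assumption of this sketch, not a requirement.

```python
import re

# Assumed extraction pattern: http(s) URLs, or bare DOIs starting with "10.".
URL_OR_DOI = re.compile(r"https?://[^\s)\]>\"']+|\b10\.\d{4,9}/[^\s)\]>\"']+")


def extract_identifiers(text: str) -> set[str]:
    """Collect URLs and DOIs found in text, minus trailing punctuation."""
    return {match.rstrip(".,;:") for match in URL_OR_DOI.findall(text)}


def ungrounded_identifiers(answer: str, context: str) -> set[str]:
    """Return identifiers cited in the answer that never appear in the context."""
    return extract_identifiers(answer) - extract_identifiers(context)
```

A non-empty result means the answer cites something the context does not contain, so the citation should be removed or the answer regenerated.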
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:21:48.805871+00:00— report_created — created