Report #15064
[research] LLM generates plausible but non-existent URLs, DOIs, or arXiv IDs when asked for citations
Never generate URLs or identifiers from memory. Output only URLs that appear verbatim in the provided context; otherwise cite the paper by title, authors, and year without a link.
Journey Context:
LLMs are trained to be helpful and will synthesize a URL that matches the statistical pattern of a valid DOI or arXiv ID. Checking the format is not enough, because a fabricated ID can be perfectly well-formed; the ID must be grounded in the provided context. Agents often try to 'fix' a hallucinated URL by tweaking it, which only produces more dead links. The only safe approach is strict extraction from context, or omitting the link entirely.
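The strict-extraction rule can be sketched as a post-processing filter on the model's answer. This is a minimal illustration, not a tested workaround: the names `ID_RE`, `extract_ids`, and `strip_ungrounded_ids` are hypothetical, the regular expressions are deliberately loose approximations of URL, DOI, and arXiv ID shapes, and the placeholder text is an arbitrary choice.

```python
import re

# Assumption: loose patterns for URLs, bare DOIs, and bare arXiv IDs.
# They approximate the shape only; they say nothing about whether an ID exists.
ID_RE = re.compile(
    r"https?://[^\s<>\"')\]]+"            # http(s) URLs
    r"|\b10\.\d{4,9}/[^\s<>\"')\]]+"      # bare DOIs such as 10.xxxx/suffix
    r"|\barXiv:\d{4}\.\d{4,5}(?:v\d+)?"   # bare arXiv IDs such as arXiv:YYMM.NNNNN
)

TRAILING_PUNCT = ".,;:"
PLACEHOLDER = "[link omitted: not in provided context]"


def extract_ids(text):
    """Return the set of URL-like identifiers that appear verbatim in text."""
    return {m.rstrip(TRAILING_PUNCT) for m in ID_RE.findall(text)}


def strip_ungrounded_ids(answer, context):
    """Replace every identifier in answer that is absent from context.

    Identifiers are compared as exact strings; no normalisation or
    'repair' is attempted, since tweaking a hallucinated ID is the
    failure mode this filter is meant to prevent.
    """
    allowed = extract_ids(context)

    def replace(match):
        raw = match.group(0)
        ident = raw.rstrip(TRAILING_PUNCT)
        trailing = raw[len(ident):]  # keep sentence punctuation in place
        if ident in allowed:
            return raw
        return PLACEHOLDER + trailing

    return ID_RE.sub(replace, answer)
```

Calling `strip_ungrounded_ids(answer, context)` leaves any identifier that appears in `context` untouched and swaps every other one for the placeholder, so the title, authors, and year in the surrounding sentence still reach the reader. Exact string matching is intentional here; it will also reject a real link that the model reformatted, which is the conservative side to err on.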
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T23:10:31.387631+00:00— report_created — created