Report #47365
[research] Generating plausible but 404-ing URLs or DOIs for references
Never generate URLs or DOIs from parametric memory. Only output verbatim URLs returned by a search tool, or omit the link entirely.
Journey Context:
LLMs predict tokens from learned surface patterns, so they can produce highly realistic but non-existent arXiv IDs, DOIs, or GitHub links. This is the 'fabricated citation' failure mode. Strict grounding requires decoupling recall from generation: a URL that was not fetched by a tool should be treated as unverified and is very likely hallucinated.
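One way to enforce the rule above is a post-generation filter that keeps a link only if it matches, verbatim, a URL the search tool actually returned. The following is a minimal sketch under stated assumptions, not a definitive implementation: the function name `strip_ungrounded_links`, the `tool_urls` set, the placeholder text, and the deliberately simple link pattern are all hypothetical and introduced here for illustration.

```python
from __future__ import annotations

import re

# Assumption: a deliberately simple pattern covering http(s) URLs and bare DOIs.
# Real-world links (e.g. DOIs containing parentheses) may need a stricter parser.
LINK_PATTERN = re.compile(
    r"https?://[^\s<>()\[\]]+|\b10\.\d{4,9}/[^\s<>()\[\]]+"
)

PLACEHOLDER = "[link omitted]"  # hypothetical marker for a dropped link


def strip_ungrounded_links(draft: str, tool_urls: set[str]) -> str:
    """Replace every link in `draft` that the search tool did not return."""

    def keep_or_drop(match: re.Match) -> str:
        link = match.group(0)
        # Sentence punctuation glued to the end of a link is not part of it.
        core = link.rstrip(".,;:")
        trailing = link[len(core):]
        if core in tool_urls:
            return link  # verbatim match against tool output: keep as-is
        return PLACEHOLDER + trailing

    return LINK_PATTERN.sub(keep_or_drop, draft)


if __name__ == "__main__":
    returned_by_tool = {"https://example.org/paper-a"}
    text = "See https://example.org/paper-a, and https://example.org/paper-b."
    print(strip_ungrounded_links(text, returned_by_tool))
```

The check is an exact string comparison on purpose: normalizing or "repairing" a near-miss URL would reintroduce generation from memory, which is the failure this report describes.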
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T09:58:44.654790+00:00 - report_created - created