Report #90086
[research] LLM generates plausible but non-existent academic citations or URLs
Implement strict citation verification: extract claimed identifiers (DOIs, URLs, arXiv IDs) and run a programmatic existence check before outputting. If unverified, strip the citation or replace with a generic statement.
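A minimal sketch of that check, in Python using only the standard library. The regular expressions, the placeholder text, and the function names (`extract_identifiers`, `http_exists`, `strip_unverified`) are illustrative assumptions, not part of the original report. The probe assumes that a HEAD request to `https://doi.org/<doi>` or `https://arxiv.org/abs/<id>` succeeds only for identifiers that exist; some publishers reject automated requests, so a failed probe here means "unverified", not "proven fake".

```python
import http.client
import re
import urllib.request
from typing import Callable

# Illustrative patterns only; real identifiers have more edge cases.
URL_RE = re.compile(r"https?://[^\s\"<>)\]]+")
DOI_RE = re.compile(r"\b(?:doi:\s*)?(10\.\d{4,9}/[^\s\"<>]+)", re.IGNORECASE)
ARXIV_RE = re.compile(r"\barXiv:\s*(\d{4}\.\d{4,5})(?:v\d+)?", re.IGNORECASE)

# Hypothetical placeholder; a generic statement would also work.
PLACEHOLDER = "[citation removed: could not be verified]"
TRAILING = ".,;)"  # sentence punctuation that regexes tend to swallow


def extract_identifiers(text: str) -> dict[str, str]:
    """Map each identifier, as written in the text, to a URL to probe."""
    found: dict[str, str] = {}
    for m in URL_RE.finditer(text):
        raw = m.group(0).rstrip(TRAILING)
        found[raw] = raw
    for m in DOI_RE.finditer(text):
        raw = m.group(0).rstrip(TRAILING)
        found[raw] = "https://doi.org/" + m.group(1).rstrip(TRAILING)
    for m in ARXIV_RE.finditer(text):
        found[m.group(0)] = "https://arxiv.org/abs/" + m.group(1)
    return found


def http_exists(url: str, timeout: float = 5.0) -> bool:
    """Return True if a HEAD request succeeds (urllib follows redirects)."""
    req = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "citation-check/0.1"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # HTTP 4xx/5xx, DNS failure, timeout, malformed URL: all unverified.
        return False


def strip_unverified(
    text: str, exists: Callable[[str], bool] = http_exists
) -> str:
    """Replace every identifier that fails the existence check."""
    ids = extract_identifiers(text)
    # Longest first, so a full URL is replaced before a DOI nested inside it.
    for raw in sorted(ids, key=len, reverse=True):
        if not exists(ids[raw]):
            text = text.replace(raw, PLACEHOLDER)
    return text
```

Passing the checker as a parameter keeps the network call swappable: a cached lookup or a bibliographic API client could replace `http_exists`, and tests can supply a stub. Run `strip_unverified(draft)` on the model's draft as the last step before returning it to the user.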
Journey Context:
LLMs are trained to predict plausible token sequences, not to query a database of truth, so a syntactically valid DOI or a realistic-sounding paper title has high prior probability whether or not it exists. Downstream agents often treat well-formed output as evidence that the reference is real. The tradeoff is added latency for each verification call, but the check stops citations that do not resolve from reaching the user, which is the most embarrassing hallucination failure mode.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T09:48:18.560010+00:00— report_created — created