Report #86772
[research] LLM generates plausible but non-existent URLs or paper titles when asked for citations
Force the agent to output only URLs that appear in verified retrieved context, or apply a strict regex filter that blocks any URL not present in that context. If there is no context, output 'No citation available'.
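A minimal sketch of the regex filter described above, assuming the agent's answer and the retrieved context are both plain strings. The names `enforce_grounded_urls`, `extract_urls`, `URL_RE`, and `TRAILING` are hypothetical, and the URL pattern is deliberately simple rather than a full RFC 3986 matcher.

```python
import re

# Assumption: a URL is any http(s) run of non-whitespace characters.
URL_RE = re.compile(r"https?://\S+")
# Assumption: punctuation that commonly trails a URL in prose, not part of it.
TRAILING = ".,;:!?)]}>'\""
NO_CITATION = "No citation available"


def extract_urls(text: str) -> set[str]:
    """Collect every URL in the text, minus trailing punctuation."""
    return {m.rstrip(TRAILING) for m in URL_RE.findall(text)}


def enforce_grounded_urls(answer: str, context: str) -> str:
    """Replace any URL in the answer that does not appear verbatim in the context."""
    allowed = extract_urls(context)

    def check(match: re.Match) -> str:
        raw = match.group(0)
        url = raw.rstrip(TRAILING)
        tail = raw[len(url):]  # keep the punctuation that followed the URL
        return (url if url in allowed else NO_CITATION) + tail

    return URL_RE.sub(check, answer)


context = "See https://example.org/paper-1 for details."
answer = "Sources: https://example.org/paper-1, https://example.org/made-up."
print(enforce_grounded_urls(answer, context))
```

With an empty context the allowlist is empty, so every URL in the answer is replaced by the fallback string. Matching is exact, so a URL that differs from the context only by a trailing slash or query parameter is also blocked; whether to normalize before comparing is a design choice this sketch leaves open.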
Journey Context:
LLMs are trained to be helpful and will confidently invent DOIs or GitHub links that resolve to 404s. Post-hoc URL validation is too late; the trust is already broken. The fix is strict grounding: a URL must exist in the provided context or be constructed via deterministic rules (e.g., `https://github.com/org/repo/blob/sha/file`), never freely generated.
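The deterministic-rule option can be sketched as a builder that assembles the GitHub blob URL from structured fields instead of letting the model write the string. The function name `github_blob_url` and the validation rules (name character set, 40-character hex SHA, relative path) are assumptions for illustration; the fields themselves are assumed to come from retrieved metadata, not from model output.

```python
import re
from urllib.parse import quote

# Assumption: org and repo names use only letters, digits, '.', '_' and '-'.
NAME_RE = re.compile(r"[A-Za-z0-9._-]+")
# Assumption: commits are referenced by a full 40-character lowercase hex SHA.
SHA_RE = re.compile(r"[0-9a-f]{40}")


def github_blob_url(org: str, repo: str, sha: str, path: str) -> str:
    """Build a blob URL from validated parts; raise instead of guessing."""
    if not NAME_RE.fullmatch(org) or not NAME_RE.fullmatch(repo):
        raise ValueError("invalid org or repo name")
    if not SHA_RE.fullmatch(sha):
        raise ValueError("sha must be a 40-character hex commit id")
    if not path or path.startswith("/") or ".." in path.split("/"):
        raise ValueError("path must be a relative file path without '..'")
    # quote() percent-encodes unsafe characters and keeps '/' separators.
    return f"https://github.com/{org}/{repo}/blob/{sha}/{quote(path)}"


print(github_blob_url("org", "repo", "a" * 40, "src/main.py"))
```

Rejecting a malformed field outright, rather than repairing it, keeps the builder from producing a plausible-looking link that was never verified.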
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:14:21.876712+00:00— report_created — created