Report #53507
[research] Agent generates plausible but entirely fake URLs, DOIs, or academic citations when asked for sources
Disable direct citation generation. Instead, implement a strict tool-use pattern: the agent must use a search tool to find real URLs/papers, and then format the exact returned snippets as citations. Never allow the LLM to compose URLs from token probabilities.
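A minimal sketch of the tool-use pattern described above, assuming a search tool that returns records with a title, URL, and snippet. The names `SearchResult`, `format_citations`, `cite`, and `fake_search` are hypothetical and not part of any particular agent framework. The point it illustrates is that every URL in the output is copied from a tool result and never composed by the model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class SearchResult:
    """One record exactly as returned by the search tool (hypothetical shape)."""
    title: str
    url: str
    snippet: str


def format_citations(results: List[SearchResult]) -> List[str]:
    """Build citation strings by copying tool-returned fields verbatim."""
    return [
        f'[{i}] {r.title}. {r.url} - "{r.snippet}"'
        for i, r in enumerate(results, start=1)
    ]


def cite(query: str, search: Callable[[str], List[SearchResult]]) -> List[str]:
    """Answer a request for sources using only what the search tool returned."""
    results = search(query)
    if not results:
        # Nothing retrieved: say so rather than letting the model guess a URL.
        return ["No sources found."]
    return format_citations(results)


def fake_search(query: str) -> List[SearchResult]:
    """Stand-in for a real search tool; the record below is placeholder data."""
    return [SearchResult("Example Domain", "https://example.com/", "Example snippet")]
```

Calling `cite("some topic", fake_search)` returns one formatted citation built from the placeholder record. With a search function that returns an empty list, it returns `["No sources found."]` instead of an invented reference.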
Journey Context:
LLMs are autoregressive text generators. When asked for a URL or DOI, they generate the most likely sequence of characters (e.g., https://arxiv.org/abs/2301.xxxxx). The result has the right shape but almost always returns a 404 or, worse, resolves to an unrelated document. This is among the most dangerous hallucinations because it looks highly credible. Tool use is the only reliable mitigation; parametric knowledge is insufficient for exact string references.
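Because a fabricated reference looks credible, a cheap guard can check the agent's draft before it is shown to the user. The sketch below is one possible approach, not an established library API: the names `extract_references` and `ungrounded_references` are hypothetical, the regular expressions are deliberately simple, and matching is exact with no normalization of case or trailing slashes. It flags every URL or DOI in the draft that was not returned by the search tool.

```python
import re
from typing import Iterable, List

# Deliberately simple patterns; they are assumptions, not a full URL/DOI grammar.
URL_RE = re.compile(r"https?://[^\s)\]>\"']+")
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"'<>]+")


def extract_references(text: str) -> List[str]:
    """Return URLs and bare DOIs found in the text, in that order."""
    urls = [u.rstrip(".,;") for u in URL_RE.findall(text)]
    # Remove URLs first so a DOI embedded in a URL is not counted twice.
    remainder = URL_RE.sub(" ", text)
    dois = [d.rstrip(".,;") for d in DOI_RE.findall(remainder)]
    return urls + dois


def ungrounded_references(draft: str, retrieved: Iterable[str]) -> List[str]:
    """List references in the draft that the search tool never returned."""
    allowed = set(retrieved)
    return [ref for ref in extract_references(draft) if ref not in allowed]


# Placeholder data: only the first URL came from the (hypothetical) search tool.
draft = "See https://example.com/a and https://example.org/made-up, doi 10.1234/fake.5678."
flagged = ungrounded_references(draft, ["https://example.com/a"])
```

Here `flagged` contains the second URL and the DOI, since neither appears in the retrieved set. A non-empty list is a signal to reject the draft or rerun the search step rather than publish the citation.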
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:18:32.039887+00:00— report_created — created