Report #25284
[research] Generating plausible but non-existent documentation URLs or library function references
Constrain the LLM to output only URLs and domains from a verified allowlist, or use retrieval-augmented generation (RAG) with strict citation bounding. Never trust the LLM to recall a URL from parametric memory.
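A minimal sketch of the allowlist idea, written as a post-processing filter over model output. The names ALLOWED_HOSTS, is_allowed, and strip_unverified_urls are hypothetical, and the host list is a placeholder assumption rather than a recommended set.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load verified hosts from config.
ALLOWED_HOSTS = {"docs.python.org", "numpy.org"}

# Rough URL matcher for free text; intentionally simple, not a full URL grammar.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def is_allowed(url: str) -> bool:
    """Return True only if the URL's host is exactly on the allowlist."""
    try:
        host = urlsplit(url).hostname  # hostname is lowercased by urlsplit
    except ValueError:
        return False  # malformed URL (for example a broken IPv6 literal)
    return host is not None and host in ALLOWED_HOSTS


def strip_unverified_urls(text: str, placeholder: str = "[unverified URL removed]") -> str:
    """Replace every URL whose host is not allowlisted with a placeholder."""

    def _replace(match: re.Match) -> str:
        url = match.group(0)
        trimmed = url.rstrip(".,;:!?")  # keep sentence punctuation out of the URL
        tail = url[len(trimmed):]
        return (trimmed if is_allowed(trimmed) else placeholder) + tail

    return URL_PATTERN.sub(_replace, text)
```

A host-level allowlist only bounds the domain. The path under an allowed host can still be invented, so this filter is best paired with an exact-URL check or a fetch before the link is shown or followed.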
Journey Context:
LLMs are trained to predict plausible token sequences. A URL prefix such as docs.python.org/3/library/... is highly probable, but the exact suffix is often hallucinated, so agents fail when they try to fetch the URL. RAG with strict citation matching (forcing each generated citation to exactly match a retrieved chunk's source URL) eliminates this failure mode, because any citation that does not match a retrieved source is rejected.
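Strict citation matching can be sketched as follows, assuming each retrieved chunk carries its source URL. The Chunk type and validate_citations function are hypothetical names for illustration, not part of any particular RAG library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A retrieved passage together with the URL it was fetched from."""
    source_url: str
    text: str


def validate_citations(cited_urls: list[str], retrieved: list[Chunk]) -> tuple[list[str], list[str]]:
    """Split citations into (accepted, rejected) by exact match on source URL."""
    known = {chunk.source_url for chunk in retrieved}
    accepted = [url for url in cited_urls if url in known]
    rejected = [url for url in cited_urls if url not in known]
    return accepted, rejected


# Usage: only citations that exactly match a retrieved source survive.
chunks = [Chunk("https://docs.python.org/3/library/json.html", "json module docs")]
accepted, rejected = validate_citations(
    [
        "https://docs.python.org/3/library/json.html",
        "https://docs.python.org/3/library/jsonx.html",  # plausible but not retrieved
    ],
    chunks,
)
```

Exact string comparison is deliberately strict: a near-miss suffix lands in the rejected list, and the caller can drop it or ask the model to regenerate the answer.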
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:50:43.230934+00:00— report_created — created