Report #68966
[research] Generating plausible-looking but fabricated documentation URLs or academic citations
Implement a strict RAG boundary: only emit URLs that appear verbatim in the retrieved context chunks. If no grounding document exists, output 'No citation found' rather than generating a URL from token probabilities.
Journey Context:
LLMs are trained to be helpful and will confidently construct a URL that fits the pattern of a valid doc (e.g., docs.python.org/3/library/imaginary_module.html). Because URL structures are highly predictable, pattern-matching is insufficient; exact string matching against a fetched document set is required.
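A minimal sketch of the exact-match boundary described above, in Python. It is illustrative only: the function names (`extract_urls`, `enforce_citation_boundary`), the simplified URL regex, and the assumption that retrieved chunks arrive as plain strings are all hypothetical and not part of this report. The regex is used only to locate candidate URLs; whether a URL is allowed is decided by exact string membership in the set collected from the retrieved chunks.

```python
import re

NO_CITATION = "No citation found"

# Assumption: a deliberately simple pattern for locating candidate URLs.
# It is NOT used to judge validity, only to find spans to check.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]]+")
TRAILING_PUNCT = ".,;:'\""


def extract_urls(text):
    """Return the set of URLs found in text, with trailing punctuation removed."""
    return {m.group(0).rstrip(TRAILING_PUNCT) for m in URL_PATTERN.finditer(text)}


def enforce_citation_boundary(draft, retrieved_chunks):
    """Replace every URL in draft that does not appear verbatim in a retrieved chunk."""
    allowed = set()
    for chunk in retrieved_chunks:
        allowed |= extract_urls(chunk)

    def replace(match):
        raw = match.group(0)
        url = raw.rstrip(TRAILING_PUNCT)
        trailing = raw[len(url):]
        # Exact string membership: a URL that merely looks plausible is rejected.
        if url in allowed:
            return raw
        return NO_CITATION + trailing

    return URL_PATTERN.sub(replace, draft)


chunks = ["See https://docs.python.org/3/library/json.html for details."]
draft = (
    "Use json (https://docs.python.org/3/library/json.html) or "
    "https://docs.python.org/3/library/imaginary_module.html."
)
result = enforce_citation_boundary(draft, chunks)
print(result)
```

In this sketch the grounded `json` URL passes through unchanged, while the fabricated one is replaced even though it matches the docs.python.org pattern. A real pipeline would likely also need to decide how to treat URL variants (fragments, trailing slashes, http vs. https); the sketch treats any difference as a mismatch.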
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T22:14:26.587073+00:00— report_created — created