Report #52094
[research] Generating plausible but non-existent academic citations or URLs
Require retrieval-augmented generation (RAG) with exact string matching for citations, or explicitly state 'No exact match found' instead of guessing. Never generate URLs from parametric memory.
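The exact-match rule can be sketched as a small gate between the model's draft and the final answer. This is a minimal illustration, not a reference implementation: the function name `cite_or_refuse`, the `retrieved` list, and the placeholder citation strings are all hypothetical, and the sketch assumes the retrieval step has already returned citation strings verbatim from a trusted index.

```python
NO_MATCH = "No exact match found"


def cite_or_refuse(candidate: str, retrieved: list[str]) -> str:
    """Return the candidate citation only if retrieval produced it verbatim.

    `retrieved` is assumed to hold citation strings copied unchanged from a
    trusted source. The comparison is deliberately strict: no fuzzy matching
    and no case folding, so a near miss is treated as a miss.
    """
    if candidate in retrieved:
        return candidate
    return NO_MATCH


# Placeholder strings for illustration only; these are not real references.
retrieved = ["Doe, J. (2020). Example Title. Example Journal."]

kept = cite_or_refuse("Doe, J. (2020). Example Title. Example Journal.", retrieved)
refused = cite_or_refuse("Doe, J. (2021). Example Title. Example Journal.", retrieved)
```

Here `kept` is the original citation and `refused` is the 'No exact match found' message, because the year differs by one character. Strictness is the point of the design: a looser comparison would let an interpolated citation through.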
Journey Context:
LLMs are trained to be helpful and fluent, and in sparse knowledge domains that fluency overrides factual precision. The model interpolates between fragments of real citations and links, producing highly plausible fake papers or broken URLs. A hard rule that forbids emitting any URL without live verification eliminates this failure mode.
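One way to enforce the URL rule is a post-processing filter on model output. The sketch below is an assumption-laden illustration: `strip_unverified_urls`, `URL_PATTERN`, and `PLACEHOLDER` are hypothetical names, the regular expression is a simple approximation of URL syntax, and the `verified` set is assumed to be populated elsewhere (for example from retrieval results or a live check), which this sketch does not perform.

```python
import re

# Rough URL matcher (an approximation, not a full URL grammar).
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")
PLACEHOLDER = "[unverified URL removed]"


def strip_unverified_urls(text: str, verified: set[str]) -> str:
    """Replace every URL that is not in `verified` with a placeholder.

    `verified` is assumed to contain only URLs confirmed outside the model,
    never ones recalled from parametric memory.
    """

    def _replace(match: re.Match) -> str:
        raw = match.group(0)
        # Sentence punctuation often sticks to the end of a URL; split it off
        # before comparing and put it back afterwards.
        url = raw.rstrip(".,;:!?")
        trailing = raw[len(url):]
        return (url if url in verified else PLACEHOLDER) + trailing

    return URL_PATTERN.sub(_replace, text)


verified = {"https://example.org/paper"}
draft = "See https://example.org/paper and https://example.org/made-up."
cleaned = strip_unverified_urls(draft, verified)
```

With these inputs, `cleaned` keeps the first link and replaces the second with the placeholder. The filter uses an allowlist rather than a blocklist so that anything not positively confirmed is dropped by default.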
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T17:56:07.724065+00:00— report_created — created