Report #7903
[research] Generating plausible but non-existent academic citations or URLs
Force the LLM to extract citations strictly from the provided context. If it must generate a citation de novo, append a verification step (e.g., an HTTP HEAD request against each URL) or enforce a strict 'No URL generation' constraint.
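A minimal sketch of the HTTP HEAD verification step, using only the Python standard library. The names `head_status` and `split_urls_by_reachability` are hypothetical, and the URL regex is a deliberately simple assumption rather than a full URL parser. A successful HEAD only shows that the URL resolves, not that the page supports the cited claim, and some servers reject HEAD requests, so treat the result as a coarse filter.

```python
import re
import urllib.error
import urllib.request
from typing import Callable, List, Optional, Tuple

# Assumption: a simple pattern that stops at whitespace and brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]{}]+")


def head_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status of a HEAD request, or None if it cannot be made."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 404).
        return err.code
    except (urllib.error.URLError, ValueError, OSError):
        # DNS failure, refused connection, timeout, malformed URL, ...
        return None


def split_urls_by_reachability(
    text: str,
    status_of: Callable[[str], Optional[int]] = head_status,
) -> Tuple[List[str], List[str]]:
    """Split the URLs found in model output into (reachable, unreachable)."""
    # Trim trailing sentence punctuation, then de-duplicate preserving order.
    urls = [u.rstrip(".,;:'\"") for u in URL_PATTERN.findall(text)]
    reachable: List[str] = []
    unreachable: List[str] = []
    for url in dict.fromkeys(urls):
        status = status_of(url)
        if status is not None and 200 <= status < 400:
            reachable.append(url)
        else:
            unreachable.append(url)
    return reachable, unreachable
```

The `status_of` parameter is injectable so the splitting logic can be exercised without network access. In a pipeline, anything in the unreachable list would be stripped from the answer or sent back to the model for correction.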
Journey Context:
LLMs are trained to predict plausible token sequences, so a fabricated URL or citation can be syntactically perfect yet point to nothing. Post-hoc filtering is brittle. Grounding citations in real retrieved (RAG) context is the only reliable fix, because the model cannot reliably distinguish URLs it memorized from URLs it generated.
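The grounding constraint can be checked mechanically: any URL or DOI in the model's answer that does not appear verbatim in the supplied context is treated as ungrounded. The sketch below assumes exact string matching and simplified URL and DOI patterns; `extract_references` and `ungrounded_references` are hypothetical helper names, not part of any library.

```python
import re
from typing import Iterable, List

# Assumptions: simplified patterns that stop at whitespace and brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]{}]+")
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s<>()\[\]{}"]+')


def extract_references(text: str) -> List[str]:
    """Return the unique URLs and DOIs in text: URLs first, then DOIs."""
    matches = URL_PATTERN.findall(text) + DOI_PATTERN.findall(text)
    # Trim trailing sentence punctuation picked up by the patterns.
    cleaned = [m.rstrip(".,;:'\"") for m in matches]
    return list(dict.fromkeys(cleaned))


def ungrounded_references(answer: str, context_chunks: Iterable[str]) -> List[str]:
    """Return references in the answer that no context chunk contains."""
    allowed = set()
    for chunk in context_chunks:
        allowed.update(extract_references(chunk))
    return [ref for ref in extract_references(answer) if ref not in allowed]


if __name__ == "__main__":
    context = [
        "Smith et al. report the result at https://example.org/smith2020 "
        "(doi:10.1234/abcd.5678)."
    ]
    answer = (
        "See https://example.org/smith2020 and https://example.org/jones2021, "
        "doi:10.1234/abcd.5678 and doi:10.9999/fake.1."
    )
    # Lists the references that appear in the answer but not in the context.
    print(ungrounded_references(answer, context))
```

A non-empty result is a signal to reject or regenerate the answer. Because the comparison is exact, a URL that differs from the context only by a trailing slash or letter case would be flagged as well; normalizing both sides before comparing is a possible refinement.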
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T04:08:28.484312+00:00— report_created — created