Report #40386
[research] When instructed to provide citations, the model generates perfectly formatted markdown links whose URLs are hallucinated and return 404
Extract the URLs from the generated markdown links and send a HEAD request to each, presenting a link to the user only if it returns HTTP 200. Alternatively, force the model to output structured JSON with separate claim and source fields to decouple formatting from fact.
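A minimal sketch of the HEAD-check workaround, using only the Python standard library. The names MD_LINK, head_ok, and strip_dead_links are hypothetical and not part of any existing tool. The regex covers only simple inline links of the form [label](http...) and is not a full markdown parser. The replacement wording for a failed link is also an assumption.

```python
import http.client
import re
import urllib.request

# Matches simple inline markdown links: [label](http://... or https://...).
# This is an illustrative pattern, not a complete markdown parser.
MD_LINK = re.compile(r"\[([^\]]+)\]\((https?://[^\s)]+)\)")


def head_ok(url, timeout=5.0):
    """Return True only if a HEAD request to `url` ends in HTTP 200."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        # urlopen follows redirects, so this checks the final response status.
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        # Covers HTTP errors (404, 405, ...), DNS failures, timeouts, bad URLs.
        return False


def strip_dead_links(text, check=head_ok):
    """Replace each markdown link whose URL fails `check` with its plain label."""
    def repl(match):
        label, url = match.group(1), match.group(2)
        if check(url):
            return match.group(0)  # keep the verified link untouched
        return f"{label} (source link could not be verified)"

    return MD_LINK.sub(repl, text)
```

The check function is injectable so the rewrite logic can be tested without network access. Note that some servers reject HEAD requests even for pages that exist, so a failed check here means "unverified" rather than "definitely fabricated". A 200 status also confirms only that the page exists, not that it supports the claim.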
Journey Context:
LLMs are heavily trained on markdown and will complete the pattern [Author Name](https://... even if the URL doesn't exist. The structural correctness of the markdown acts as a reward signal to the model, masking the factual void. Models frequently generate fluent but ungrounded citations when format and fact are entangled.
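One way the structured-output approach could look, as a sketch under stated assumptions: the model is asked to return a JSON array of objects with "claim" and "source" keys, and the application builds the markdown itself only after a source has been verified. The schema, the function names parse_citations and render, and the fallback wording are all hypothetical.

```python
import json


def parse_citations(raw):
    """Parse model output expected as a JSON array of {"claim", "source"} objects."""
    items = json.loads(raw)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of citation objects")
    parsed = []
    for item in items:
        if not isinstance(item, dict):
            raise ValueError("each citation must be a JSON object")
        claim = item.get("claim")
        source = item.get("source")
        if not isinstance(claim, str) or not claim.strip():
            raise ValueError("every item needs a non-empty 'claim' string")
        # A null source is allowed: the model can admit it has no source.
        if source is not None and not isinstance(source, str):
            raise ValueError("'source' must be a string or null")
        parsed.append((claim, source))
    return parsed


def render(parsed, verified):
    """Emit markdown links only for sources present in the `verified` set."""
    lines = []
    for claim, source in parsed:
        if source and source in verified:
            lines.append(f"- {claim} [source]({source})")
        else:
            lines.append(f"- {claim} (no verified source)")
    return "\n".join(lines)
```

Because the model never writes the link syntax itself, a well-formed link in the final output reflects a source that passed verification rather than a completed pattern. The `verified` set is assumed to come from a separate check, such as a HEAD request per source.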
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:15:40.142368+00:00— report_created — created