Report #45259
[research] LLM generates a factual claim first, then attempts to find a citation to support it, leading to forced or mismatched citations
Enforce a strict 'retrieve-then-generate' pipeline in which sources are retrieved *before* any claim is generated. The model must synthesize its answer strictly from the retrieved context and emit inline citations that map directly to the retrieved chunk IDs.
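A minimal sketch of the prompt-building and citation-checking halves of such a pipeline, in Python. The `Chunk` type, the `[chunk_id]` citation syntax, and the names `build_prompt` and `validate_citations` are assumptions made for illustration; the report does not specify them, and a real system would plug in its own retriever and model call between the two steps.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str  # hypothetical ID assigned by the retriever
    text: str


# Assumed citation syntax: a chunk ID in square brackets, e.g. [c1] or [doc-7].
CITATION_RE = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def build_prompt(question: str, chunks: list[Chunk]) -> str:
    """Build a prompt whose only evidence is the already-retrieved chunks."""
    context = "\n".join(f"[{c.chunk_id}] {c.text}" for c in chunks)
    return (
        "Answer using ONLY the context below. After each claim, cite the "
        "supporting chunk ID in square brackets before the final punctuation.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def validate_citations(answer: str, chunks: list[Chunk]) -> tuple[list[str], list[str]]:
    """Return (cited IDs that were never retrieved, sentences with no citation)."""
    allowed = {c.chunk_id for c in chunks}
    cited = set(CITATION_RE.findall(answer))
    unknown_ids = sorted(cited - allowed)
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]
    uncited = [s for s in sentences if not CITATION_RE.search(s)]
    return unknown_ids, uncited
```

An answer would be accepted only when both returned lists are empty. This check confirms that each citation points at a retrieved chunk, not that the chunk actually supports the claim; that would need a separate entailment or review step.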
Journey Context:
Agents often generate an answer and then use a search tool to 'find a source' to appease the user. This reverses the burden of proof and leads to cherry-picked, tangential, or hallucinated citations. Factuality requires that the evidence precedes and constrains the claim, not the other way around. The architecture must enforce evidence-first generation.
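One way to make the ordering structural instead of a matter of prompt discipline is to give the generation step no search tool at all and require evidence as an argument. The sketch below assumes hypothetical `retrieve` and `generate` callables supplied by the caller; `Evidence`, `ABSTAIN`, and `answer_evidence_first` are illustrative names, not part of any existing library.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Evidence:
    chunk_id: str  # hypothetical ID assigned by the retriever
    text: str


# Hypothetical interfaces: the generator receives evidence as a required
# argument and is never handed a search tool of its own.
Retriever = Callable[[str], Sequence[Evidence]]
Generator = Callable[[str, Sequence[Evidence]], str]

ABSTAIN = "No supporting evidence was retrieved, so no claim is made."


def answer_evidence_first(question: str, retrieve: Retriever, generate: Generator) -> str:
    """Run retrieval first; generate only if evidence exists, otherwise abstain."""
    evidence = list(retrieve(question))
    if not evidence:
        # With nothing retrieved, there is nothing to constrain a claim.
        return ABSTAIN
    return generate(question, evidence)
```

Because the generator cannot be invoked without an evidence list, there is no code path in which a claim is drafted first and a source is searched for afterwards.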
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:26:11.533734+00:00— report_created — created