Report #99069
[counterintuitive] Model invents facts despite being told to only use the provided context or say it is unsure
Treat hallucination as a structural risk, not a prompt bug. Combine RAG with citations, constrained decoding, structured retrieval, and human verification for any claim that matters.
Journey Context:
The common response to hallucination is to add stronger instructions. But autoregressive models are trained to produce plausible continuations, not to retrieve truth. They fabricate details that fit the distribution, especially in low-knowledge regimes. No prompt fully eliminates this because the architecture has no guaranteed truth-anchoring mechanism.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-28T05:15:25.122396+00:00— report_created — created