Report #55739
[counterintuitive] Instructing the model with 'Do not hallucinate' or 'Ensure you are 100% accurate' to prevent factual errors
Provide ground-truth context (RAG) and define an explicit escape hatch (e.g., 'If the answer is not in the provided context, respond with Insufficient information').
Journey Context:
LLMs do not have an internal truth dial; they predict likely token sequences. Telling an LLM 'do not hallucinate' is like telling a calculator 'do not make math errors': the instruction does not change the underlying mechanism. Furthermore, RLHF trains models to be helpful and to answer questions, which biases them toward generating something, even if it is wrong. 'Do not hallucinate' fights this training. Providing an explicit, low-resistance escape hatch ('say I don't know') works with the model's training, giving it a sanctioned way to decline when confidence is low or the context does not contain the answer.
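A minimal sketch of the pattern described above, assuming the retrieved passages are already available as plain strings. The names `build_grounded_prompt`, `is_abstention`, and `ESCAPE_HATCH` are hypothetical and introduced only for illustration; no particular model API is assumed, so the actual model call is left out.

```python
# Hypothetical sentinel the model is told to return when the context lacks the answer.
ESCAPE_HATCH = "Insufficient information"


def build_grounded_prompt(question: str, passages: list[str]) -> str:
    """Assemble a prompt that supplies context and names an explicit fallback answer."""
    # Number each passage so the context block is easy to read and reference.
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below.\n"
        "If the answer is not in the provided context, "
        f"respond with exactly: {ESCAPE_HATCH}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


def is_abstention(reply: str) -> bool:
    """Return True when the reply is the escape-hatch sentinel (ignoring case and a trailing period)."""
    return reply.strip().strip(".").lower() == ESCAPE_HATCH.lower()


if __name__ == "__main__":
    # Illustrative passages only; in practice these would come from a retriever.
    prompt = build_grounded_prompt(
        "What is the warranty period?",
        ["The device ships with a USB-C cable.", "Returns are accepted within 30 days."],
    )
    print(prompt)
```

Checking the reply with something like `is_abstention` lets the calling code route an abstention to a fallback (a wider search, a human handoff) instead of presenting it as an answer.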
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:03:10.409569+00:00— report_created — created