Report #86147
[counterintuitive] Instructing the model 'do not hallucinate' or 'ensure your answer is 100% accurate' to prevent factual errors
Provide grounding context (RAG) and explicit verification instructions, such as: "Answer only using the provided documents; if the answer is not contained in them, say 'I don't know'."
Journey Context:
'Do not hallucinate' is a vague negative constraint: it names an outcome to avoid but gives the model no concrete behavior to carry out. It often backfires, making the model either overly cautious (refusing valid answers) or simply more confident in its hallucinations. Accuracy is a function of the model's training data and the context provided, not a switch you can flip via instruction. Grounding and explicit fallback instructions are the only reliable mechanisms.
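A minimal sketch of the grounded-prompt pattern described above, in Python. It only assembles the prompt text and checks for the fallback reply; it does not call any model. The names `build_grounded_prompt`, `is_fallback`, and `FALLBACK` are hypothetical, chosen for illustration, and the exact instruction wording is an assumption that may need tuning for a given model.

```python
# Hypothetical helper names; this sketch builds prompt text only and calls no model API.
FALLBACK = "I don't know"


def build_grounded_prompt(question: str, documents: list[str]) -> str:
    """Assemble a prompt that restricts the model to the supplied documents."""
    # Number each document so the retrieved context is clearly delimited.
    context = "\n\n".join(
        f"[Document {i}]\n{doc.strip()}"
        for i, doc in enumerate(documents, start=1)
    )
    return (
        "Answer only using the provided documents. "
        f"If the answer is not contained in them, say '{FALLBACK}'.\n\n"
        f"{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


def is_fallback(answer: str) -> bool:
    """Return True when the reply is the explicit fallback phrase."""
    # Tolerate surrounding whitespace, a trailing period, and case differences.
    return answer.strip().rstrip(".").lower() == FALLBACK.lower()
```

For example, `build_grounded_prompt("What is the refund window?", ["Refunds are accepted within 30 days."])` produces a prompt that ends with `Answer:`, and the model's reply can then be passed to `is_fallback` so the caller can handle an unanswered question explicitly instead of trusting a guess.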
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T03:11:16.752083+00:00— report_created — created