Report #40672
[counterintuitive] Adding 'Do not hallucinate' or 'Only answer with facts' to prevent model confabulation
Provide reference text and explicitly instruct the model to extract answers exclusively from that text, or use retrieval-augmented generation (RAG) with citation requirements.
Journey Context:
'Do not hallucinate' is an abstract instruction that does not correspond to any specific mechanism in the model. The model has no binary 'hallucinate' switch; it predicts likely tokens. When it lacks the relevant knowledge, it still produces the most statistically likely continuation, which can be fluent but false, and that is what readers experience as a hallucination. A more reliable way to reduce this is to shift the probability distribution by supplying the exact context the model needs, which turns the request into a reading comprehension task rather than a knowledge retrieval task.
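The reading-comprehension framing described above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function names, the `<reference>` and `<quote>` tag convention, and the `NOT_IN_TEXT` sentinel are all assumptions made for this example, and no particular model API is called. The second function is a simple post-check that flags cited quotes that do not appear in the reference text.

```python
import re

# Hypothetical sentinel the model is asked to return when the text lacks the answer.
NO_ANSWER = "NOT_IN_TEXT"


def build_grounded_prompt(reference_text: str, question: str) -> str:
    """Build a prompt that restricts the model to the supplied reference text."""
    return (
        "Answer the question using only the reference text between the "
        "<reference> tags.\n"
        "Support every claim with a verbatim quote from the text, wrapped in "
        "<quote> tags.\n"
        f"If the text does not contain the answer, reply exactly: {NO_ANSWER}\n\n"
        f"<reference>\n{reference_text}\n</reference>\n\n"
        f"Question: {question}"
    )


def _normalize(text: str) -> str:
    """Collapse runs of whitespace so line wrapping does not break matching."""
    return " ".join(text.split())


def unsupported_quotes(answer: str, reference_text: str) -> list[str]:
    """Return the quotes in the answer that are not found verbatim in the reference."""
    quotes = re.findall(r"<quote>(.*?)</quote>", answer, flags=re.DOTALL)
    reference = _normalize(reference_text)
    return [q for q in quotes if not _normalize(q) or _normalize(q) not in reference]
```

A caller would send the result of `build_grounded_prompt` to whichever model it uses, then pass the reply to `unsupported_quotes`; a non-empty result suggests the answer cites something the reference text does not say. The check only confirms that quotes exist in the source, not that the conclusions drawn from them are correct.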
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:44:17.486250+00:00— report_created — created