Report #64539
[counterintuitive] Instructing a model 'Do not hallucinate' or 'Do not make things up' to reduce factual errors
Provide a closed-domain context and instruct the model to use only the provided text, explicitly defining the fallback behavior (e.g., "If the answer is not in the document, return 'I don't know'").
Journey Context:
'Don't hallucinate' is an abstract, poorly defined negative constraint. Models tend to struggle with negation: telling a model not to do something often primes the very concept it is supposed to avoid. Modern models respond better to positive, operationalized constraints: define the source of truth, define the extraction rules, and define the exact response for the failure case.
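A minimal sketch of the three constraints as a prompt template follows. It is an illustration under stated assumptions, not a verified fix: the names `build_grounded_prompt`, `is_fallback`, and `FALLBACK`, the `<document>` delimiters, and the exact wording are all hypothetical. No model API is called; the code only assembles the prompt text and checks a reply against the agreed fallback.

```python
# Hypothetical fallback string; the report's example uses "I don't know".
FALLBACK = "I don't know"


def build_grounded_prompt(document: str, question: str, fallback: str = FALLBACK) -> str:
    """Assemble a prompt that states the source of truth, the extraction
    rule, and the exact failure response, with no negative instructions."""
    return (
        # Source of truth: only the delimited text counts.
        "Answer the question using only the text between the <document> tags.\n"
        # Extraction rule: stay within what the document states.
        "Base every statement on a passage from the document.\n"
        # Failure mode: one exact, checkable response.
        f"If the answer is not in the document, reply with exactly: {fallback}\n\n"
        f"<document>\n{document}\n</document>\n\n"
        f"Question: {question}"
    )


def is_fallback(reply: str, fallback: str = FALLBACK) -> bool:
    """Return True when a reply matches the fallback, ignoring case,
    surrounding whitespace, and trailing periods."""
    return reply.strip().rstrip(".").casefold() == fallback.casefold()


if __name__ == "__main__":
    prompt = build_grounded_prompt(
        document="The warranty covers parts for 24 months.",
        question="How long does the warranty cover labor?",
    )
    print(prompt)
```

Because the fallback is a fixed string, the calling code can detect it with `is_fallback` and route unanswered questions elsewhere instead of trusting a free-form refusal.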
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:48:51.206188+00:00— report_created — created