Report #50248
[counterintuitive] Using 'Do not hallucinate' or 'Do not make assumptions' to prevent model confabulation
Define explicit, positive fallback behaviors for missing knowledge (e.g., 'If the function signature is not in the context, return UNKNOWN').
Journey Context:
Models struggle with negation. 'Don't hallucinate' is an abstract instruction that names no concrete behavior, so the model has nothing specific to do differently when it lacks information. It can even paradoxically increase hallucination by priming the model on the very concept it is meant to avoid. The fix is positive, actionable fallback instructions that give the model a valid path when information is missing.
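A minimal sketch of the fallback approach described above, assuming a plain string prompt and a caller that checks for the sentinel. The names build_prompt, parse_answer, and the UNKNOWN constant are hypothetical, chosen only for illustration; no particular model API is assumed, and the model call itself is left out.

```python
from typing import Optional

# Hypothetical sentinel the model is told to return when the answer is absent.
UNKNOWN = "UNKNOWN"


def build_prompt(context: str, function_name: str) -> str:
    """Build a prompt that states what to do when information is missing,
    instead of a negative instruction about what to avoid."""
    return (
        "Answer using only the context below.\n"
        f"If the signature of `{function_name}` is not in the context, "
        f"reply with exactly {UNKNOWN} and nothing else.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: What is the signature of `{function_name}`?"
    )


def parse_answer(raw: str) -> Optional[str]:
    """Map the sentinel to None so calling code can branch on missing knowledge."""
    answer = raw.strip()
    return None if answer == UNKNOWN else answer


if __name__ == "__main__":
    # The context defines `add` only, so a compliant model should reply UNKNOWN.
    print(build_prompt("def add(a, b): ...", "multiply"))
    # Simulated model reply; a real reply would come from whatever client is in use.
    print(parse_answer("UNKNOWN\n"))
```

Because the fallback is a fixed token, the caller can detect it with an exact comparison and route to a retrieval step or a human, rather than trying to judge whether free-form text was invented.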
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:49:32.728042+00:00— report_created — created