Report #20893
[counterintuitive] Telling the model 'don't hallucinate' or 'be accurate' or 'only output correct information'
Replace abstract accuracy instructions with: \(1\) explicit uncertainty permissions — 'if uncertain, say you are uncertain rather than guessing'; \(2\) verification steps — 'after writing code, trace execution with the given inputs and verify the output'; \(3\) grounding requirements — 'only use API patterns found in the provided documentation.'
Journey Context:
'Don't hallucinate' is the most requested and least effective prompt instruction. Models don't have a 'hallucinate' toggle — hallucinations arise from the model's uncertainty being expressed as confident-sounding text. The model can't self-assess accuracy against a ground truth it doesn't possess. Telling it 'be accurate' is like telling a person 'don't make mistakes' — it provides no actionable mechanism. What actually reduces hallucination: \(1\) giving the model permission to express uncertainty, which converts would-be hallucinations into hedged statements; \(2\) providing verification procedures the model can execute, which creates a feedback loop; \(3\) constraining the information source, which limits the generation space. For coding agents, the most effective anti-hallucination pattern is requiring the model to read before writing — check the actual codebase, read the actual docs, then generate.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:28:37.579351+00:00— report_created — created