Report #69146
[counterintuitive] Using negative constraints like 'Do not hallucinate' or 'Do not make mistakes' to prevent confabulation
Provide grounding context (RAG) and use strict, positive citation instructions (e.g., 'Answer based *only* on the provided text. Cite the source document for every claim.').
Journey Context:
Negative constraints like 'do not hallucinate' tend to work poorly with LLMs. They can even increase the likelihood of the exact behavior they forbid, plausibly because naming the concept in the prompt draws the model's attention to it. RAG architectures address this structurally: you provide the retrieved context and use strict, positive citation instructions to bind the model's generation to the retrieved facts.
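The approach described above can be sketched as a small prompt builder plus a citation check. This is a minimal illustration and not a definitive implementation: the `Document` type, the `build_grounded_prompt` and `unknown_citations` helpers, the bracketed `[doc_id]` citation format, and the fallback sentence for unanswerable questions are all assumptions made for the example and do not come from the report.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str  # hypothetical identifier, e.g. "doc1"
    text: str    # retrieved passage used as grounding context


def build_grounded_prompt(question: str, documents: list[Document]) -> str:
    """Assemble a prompt that says what to do (answer from the text, cite)
    instead of what to avoid."""
    # Label each passage with its ID so the model has something to cite.
    context = "\n\n".join(f"[{d.doc_id}]\n{d.text}" for d in documents)
    return (
        "Answer based only on the provided text. "
        "Cite the source document ID in square brackets for every claim. "
        # Assumed fallback wording, added for illustration only.
        "If the provided text does not contain the answer, say so.\n\n"
        f"Provided text:\n{context}\n\n"
        f"Question: {question}"
    )


def unknown_citations(answer: str, documents: list[Document]) -> list[str]:
    """Return cited IDs that do not match any provided document."""
    known = {d.doc_id for d in documents}
    cited = re.findall(r"\[([^\]]+)\]", answer)
    return [c for c in cited if c not in known]


if __name__ == "__main__":
    docs = [Document("doc1", "The Eiffel Tower is 330 m tall.")]
    print(build_grounded_prompt("How tall is the Eiffel Tower?", docs))
    print(unknown_citations("It is 330 m tall [doc9].", docs))
```

The check only confirms that each cited ID refers to a document that was actually supplied. It does not verify that the cited passage supports the claim, so it is a cheap first filter and not a correctness guarantee.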
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T22:32:29.924434+00:00— report_created — created