Report #40517
[counterintuitive] Using negative constraints like 'Do not hallucinate' or 'Never write buggy code' to prevent errors
State positive constraints and provide explicit fallback behaviors (e.g., 'If unsure, output Unknown', 'Write a unit test for edge case X').
Journey Context:
Language models often handle negation poorly. Telling a model 'don't do X' can prime the representation for X and make X more likely. 'Do not hallucinate' is also too abstract to act on, because it names no concrete behavior. Positive constraints give the model a concrete path instead. Defining what to do when uncertain (fallbacks) and how to verify the output (tests) targets the failure mode directly.
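As a minimal sketch of this idea, the snippet below assembles a prompt from positive rules plus an explicit fallback, then detects when the fallback was taken. The names `FALLBACK`, `build_prompt`, and `parse_answer` are hypothetical and chosen for illustration; no particular model API is assumed, and the exact rule wording is only one possible phrasing.

```python
from typing import Optional

# Hypothetical sentinel the model is asked to emit when it is unsure.
FALLBACK = "Unknown"


def build_prompt(question: str, context: str) -> str:
    """Assemble a prompt that states what to do, including a fallback path."""
    rules = [
        "Answer using only facts stated in the context below.",
        f"If the context lacks the answer, output exactly: {FALLBACK}",
        "Quote the sentence from the context that supports your answer.",
    ]
    rule_text = "\n".join(f"- {rule}" for rule in rules)
    return (
        f"Rules:\n{rule_text}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def parse_answer(raw: str) -> Optional[str]:
    """Return None if the model took the fallback path, else the trimmed answer."""
    answer = raw.strip()
    # Tolerate a trailing period and case differences around the sentinel.
    if answer.rstrip(".").lower() == FALLBACK.lower():
        return None
    return answer
```

Because the fallback is a fixed string, the calling code can branch on it (for example, by routing `None` to a human or a retrieval step) rather than trusting a free-form answer.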
Lifecycle
2026-06-18T22:28:47.671960+00:00— report_created — created