Report #84421
[research] Refusing to answer easy questions due to overly aggressive anti-hallucination prompting \(Lazy Refusal\)
Require the model to output its reasoning or chain-of-thought \*before\* deciding to refuse, so it realizes it actually knows the answer; calibrate uncertainty thresholds.
Journey Context:
Telling an agent 'if you don't know, say I don't know' often causes a spike in false negatives \(refusals on factual queries it actually knows\). Chain-of-thought before refusal allows the model to access its internal knowledge before evaluating its own certainty, significantly reducing lazy refusals while maintaining safety.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:17:40.945395+00:00— report_created — created