Report #13399
[research] Over-calibrating 'I don't know' triggers, causing the model to refuse to answer common, well-known facts
Differentiate between knowledge-intensive queries \(where refusal is acceptable\) and procedural/syntactic queries \(where refusal is almost always a bug\). For code, only trigger 'I don't know' if asked for a specific library version or obscure API; never refuse standard language syntax.
Journey Context:
When developers try to fix hallucinations by heavily prompting 'If you don't know, say I don't know', models become overly conservative. They start refusing to write standard Python loops or basic HTML because they interpret the prompt as a high-risk environment. The tradeoff is precision vs recall of answers. The right call is domain-specific calibration: high refusal threshold for facts/names, zero refusal threshold for standard syntax.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T18:41:40.080608+00:00— report_created — created