Report #13746
[research] Model over-refuses questions it can actually answer, defaulting to 'I don't know' because of overly aggressive uncertainty guardrails
Differentiate between epistemic uncertainty (lack of knowledge) and aleatoric uncertainty (ambiguity in the prompt). Use few-shot examples of edge-case questions the model should answer to calibrate the refusal boundary, and avoid absolute directives like 'Never guess.'
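The recommendation above could be sketched as a prompt builder like the one below. This is only an illustration: the names `SYSTEM`, `FEW_SHOT`, and `build_prompt` are hypothetical, the example questions are invented for the sketch, and the wording of the system text is an assumption rather than a tested prompt.

```python
# Hypothetical system text: it describes when to answer, hedge, clarify,
# or abstain, instead of issuing an absolute "never" directive.
SYSTEM = (
    "Answer when you are reasonably confident, and say so if you are only "
    "fairly sure. If the question is ambiguous, ask a clarifying question. "
    "Say you don't know only when you lack the information needed to answer."
)

# Illustrative few-shot examples (invented for this sketch), one per case.
FEW_SHOT = [
    {
        # Mild epistemic uncertainty: the model should still answer.
        "question": "Roughly how far is the Moon from Earth?",
        "answer": "About 384,000 km on average.",
    },
    {
        # Aleatoric uncertainty: the prompt itself is ambiguous.
        "question": "How long is the bridge?",
        "answer": "Which bridge do you mean? I can answer once I know.",
    },
    {
        # Genuine lack of knowledge: abstaining is the right call.
        "question": "What did I have for breakfast today?",
        "answer": "I don't know; I have no access to that information.",
    },
]


def build_prompt(question: str) -> str:
    """Assemble the system text, the few-shot pairs, and the new question."""
    parts = [SYSTEM, ""]
    for example in FEW_SHOT:
        parts.append(f"Q: {example['question']}")
        parts.append(f"A: {example['answer']}")
        parts.append("")
    parts.append(f"Q: {question}")
    parts.append("A:")
    return "\n".join(parts)
```

Calling `build_prompt("What is the boiling point of water at sea level?")` returns a single string ending in `A:` that can be passed to whichever completion interface is in use. The three examples are chosen so that the refusal boundary is shown from both sides: one answer despite mild doubt, one clarification request, and one legitimate abstention.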
Journey Context:
In an attempt to eliminate hallucinations, developers often over-prompt models with instructions such as 'If you don't know, say I don't know.' This causes a sharp drop in recall (helpfulness), because the model applies the instruction to the tails of its distribution, where it is slightly uncertain but would still have answered correctly. Balancing factuality and helpfulness requires letting the model express calibrated uncertainty rather than forcing a binary know/don't-know decision.
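The difference between a binary know/don't-know decision and calibrated uncertainty can be sketched as two response policies. Everything here is an assumption made for illustration: the report does not say how a confidence score would be obtained, so `confidence` is a hypothetical value in [0, 1], and the thresholds `answer_at` and `hedge_at` are arbitrary placeholders that would need tuning.

```python
ABSTAIN = "I don't know."


def binary_respond(answer: str, confidence: float, threshold: float = 0.8) -> str:
    """Binary policy: anything below the threshold becomes a refusal."""
    return answer if confidence >= threshold else ABSTAIN


def calibrated_respond(
    answer: str,
    confidence: float,
    answer_at: float = 0.8,  # assumed threshold, not taken from the report
    hedge_at: float = 0.4,   # assumed threshold, not taken from the report
) -> str:
    """Three-way policy: answer, answer with a hedge, or abstain."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if confidence >= answer_at:
        return answer
    if confidence >= hedge_at:
        # Slightly uncertain but plausibly correct: keep the answer, flag the doubt.
        return f"I'm not certain, but I believe: {answer}"
    return ABSTAIN
```

For a moderately confident answer, for example `confidence=0.6`, `binary_respond` discards it and returns the refusal, while `calibrated_respond` keeps the answer and prefixes a hedge. That middle band is where the over-refusal described above occurs.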
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T19:42:09.532223+00:00— report_created — created