Report #40024
[research] Prompting an LLM to say 'I don't know' causes it to over-abstain on easy questions it would have answered correctly
Instead of global abstention prompts, use targeted abstention based on domain boundaries. Explicitly define the scope of allowed knowledge in the system prompt, and only trigger abstention when the query falls outside that scope, rather than relying on the model's internal uncertainty threshold.
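A minimal sketch of the scope-based approach described above, under stated assumptions. The names `Scope`, `build_system_prompt`, `is_abstention`, the `OUT_OF_SCOPE` sentinel, and the PostgreSQL example scope are all hypothetical and chosen for illustration. The prompt wording is one possible phrasing, not a tested recipe.

```python
from dataclasses import dataclass
from typing import Tuple

# Sentinel the model is asked to emit for out-of-scope queries (hypothetical name).
OUT_OF_SCOPE = "OUT_OF_SCOPE"


@dataclass(frozen=True)
class Scope:
    """Explicit description of what the assistant is allowed to answer."""
    name: str
    topics: Tuple[str, ...]


def build_system_prompt(scope: Scope) -> str:
    """Build a prompt that ties abstention to the scope, not to confidence."""
    topic_lines = "\n".join(f"- {topic}" for topic in scope.topics)
    return (
        f"You are an assistant for {scope.name}.\n"
        f"You may answer questions about these topics:\n{topic_lines}\n"
        "Answer every question that falls within these topics.\n"
        f"If a question falls outside them, reply with exactly: {OUT_OF_SCOPE}"
    )


def is_abstention(reply: str) -> bool:
    """Return True when the model's reply is the out-of-scope sentinel."""
    return reply.strip() == OUT_OF_SCOPE


# Illustrative scope only; substitute the real domain boundaries.
scope = Scope(name="a PostgreSQL support desk",
              topics=("SQL syntax", "indexing", "query planning"))
print(build_system_prompt(scope))
```

The prompt deliberately avoids phrases such as "only answer if you are sure", so the only refusal trigger is the scope check. A fixed sentinel keeps abstentions easy to detect in downstream code.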
Journey Context:
Models are poorly calibrated about the limits of their own knowledge. When told 'only answer if you are sure', they often refuse questions well within their capability (especially in niche or underrepresented domains, where token probabilities are naturally lower), while still confidently repeating popular misconceptions. Scope-based abstention is more predictable and preserves capability better than probability-based abstention.
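One way to check whether a given abstention prompt has this effect is to measure how many questions the model answers correctly without the prompt but refuses with it. The sketch below assumes each question has already been run under both conditions. `Trial` and `over_abstention_rate` are hypothetical names, and the metric is an illustrative definition, not one taken from this report.

```python
from typing import Iterable, NamedTuple


class Trial(NamedTuple):
    """One question evaluated under two prompts (field names are hypothetical)."""
    correct_without_abstention_prompt: bool
    abstained_with_abstention_prompt: bool


def over_abstention_rate(trials: Iterable[Trial]) -> float:
    """Fraction of baseline-correct questions that the abstention prompt turned into refusals."""
    answerable = [t for t in trials if t.correct_without_abstention_prompt]
    if not answerable:
        return 0.0  # No baseline-correct questions, so nothing could be lost.
    lost = sum(t.abstained_with_abstention_prompt for t in answerable)
    return lost / len(answerable)
```

Comparing this rate between a global "only answer if you are sure" prompt and a scope-based prompt on the same question set would show which one costs more correct answers.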
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:38:57.780402+00:00— report_created — created