Report #9225
[research] LLM refuses to answer common-knowledge questions when instructed to 'say I don't know if unsure,' leading to low recall
Calibrate the abstention threshold by separating generation from verification. First, generate the answer without any instruction to abstain. Second, use a separate prompt or model to verify whether the generated answer is factually supported. Abstain only if the verifier rejects it. Selective prediction based on model probability can supply the abstention signal: answer when the model's confidence clears a threshold, and abstain otherwise.
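The two steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: `answer_with_verification`, `toy_generate`, and `toy_verify` are hypothetical names, and the toy functions stand in for real model calls (a generator prompted without any abstention instruction, and a separate verifier prompt or model).

```python
from typing import Callable

ABSTAIN = "I don't know"


def answer_with_verification(
    question: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], bool],
) -> str:
    """Generate an answer first, then abstain only if the verifier rejects it."""
    # Step 1: the generator always produces a candidate answer.
    candidate = generate(question)
    # Step 2: a separate check decides whether the candidate is kept.
    if verify(question, candidate):
        return candidate
    return ABSTAIN


# Hypothetical stand-ins for model calls, so the sketch runs on its own.
def toy_generate(question: str) -> str:
    answers = {
        "What is the capital of France?": "Paris",
        "Who wrote Hamlet?": "Christopher Marlowe",  # deliberately wrong
    }
    return answers.get(question, "unknown")


def toy_verify(question: str, answer: str) -> bool:
    supported = {
        ("What is the capital of France?", "Paris"),
        ("Who wrote Hamlet?", "William Shakespeare"),
    }
    return (question, answer) in supported


if __name__ == "__main__":
    for q in ("What is the capital of France?", "Who wrote Hamlet?"):
        print(q, "->", answer_with_verification(q, toy_generate, toy_verify))
```

Passing the generator and verifier as callables keeps the abstention decision in one place, so either side can be swapped for a different prompt or model without touching the control flow.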
Journey Context:
A blanket 'say I don't know' instruction drastically increases false negatives (over-refusal), because the model becomes overly conservative in order to avoid being penalized for wrong answers. The generate-then-verify pipeline decouples recall from precision: the generator can answer greedily, while the verifier acts as a high-precision filter. The result is a better operating point on the coverage-accuracy tradeoff than a single abstention instruction gives.
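One way to make the coverage-accuracy tradeoff concrete is to choose the abstention threshold on held-out data. The sketch below assumes each held-out example carries a confidence score (for instance the model's answer probability or a verifier score) and a correctness label; `HELD_OUT`, `coverage_and_accuracy`, and `pick_threshold` are hypothetical names, and the data is made up for illustration.

```python
from typing import List, Optional, Tuple

# (confidence, answer_was_correct) pairs; illustrative values only.
HELD_OUT: List[Tuple[float, bool]] = [
    (0.95, True),
    (0.90, True),
    (0.80, True),
    (0.60, False),
    (0.55, True),
    (0.30, False),
]


def coverage_and_accuracy(
    scored: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, Optional[float]]:
    """Coverage = fraction answered; accuracy = fraction correct among answered."""
    answered = [correct for conf, correct in scored if conf >= threshold]
    coverage = len(answered) / len(scored)
    accuracy = sum(answered) / len(answered) if answered else None
    return coverage, accuracy


def pick_threshold(
    scored: List[Tuple[float, bool]], target_accuracy: float
) -> Optional[float]:
    """Return the lowest threshold that meets the target accuracy.

    Coverage never increases as the threshold rises, so the lowest
    qualifying threshold is also the one with the highest coverage.
    """
    for threshold in sorted({conf for conf, _ in scored}):
        _, accuracy = coverage_and_accuracy(scored, threshold)
        if accuracy is not None and accuracy >= target_accuracy:
            return threshold
    return None  # no threshold reaches the target on this data


if __name__ == "__main__":
    t = pick_threshold(HELD_OUT, target_accuracy=0.8)
    print(t, coverage_and_accuracy(HELD_OUT, t))
```

A threshold chosen this way only holds for data resembling the held-out set, so it is worth re-checking when the question distribution changes.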
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T07:39:53.518035+00:00— report_created — created