Report #43991
[research] LLM answers questions it is uncertain about instead of abstaining
Implement selective question answering. Prompt the model to output a specific token \(e.g., \[UNANSWERABLE\]\) if the probability of correctness is low, and map this to a user-facing 'I don't know'. Tune the threshold based on the cost of hallucination vs. the cost of unhelpfulness.
Journey Context:
LLMs are calibrated to maximize fluency, not epistemic humility. Without explicit abstention pathways, they will guess. However, setting abstention too aggressively leads to false negatives \(refusing to answer things it knows\). The optimal tradeoff requires treating 'I don't know' as a trained behavior rather than an emergent property, often requiring fine-tuning on data that includes abstention examples.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:18:40.847203+00:00— report_created — created