Report #64542
[research] Model answers questions it is uncertain about instead of abstaining, leading to confident hallucinations
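Implement selective question answering via self-consistency checks: sample multiple reasoning paths (temperature > 0) and abstain (answer 'I don't know') if the final answers disagree too much, that is, if their variance (for numeric answers) or the share of samples that differ from the majority answer (for categorical answers) exceeds a threshold.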
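A minimal sketch of the check described above, for categorical (string) answers. The names `sample_answer`, `n_samples`, and `min_agreement` are hypothetical and not part of the report: `sample_answer` stands in for whatever call samples one answer from the model at temperature > 0, and the 0.6 cutoff is an illustrative default, not a recommended value.

```python
from collections import Counter
from typing import Callable, Optional


def normalize(answer: str) -> str:
    """Canonicalize an answer so trivially different strings compare equal."""
    return " ".join(answer.strip().lower().rstrip(".").split())


def self_consistent_answer(
    question: str,
    sample_answer: Callable[[str], str],  # hypothetical: one sampled model answer
    n_samples: int = 10,
    min_agreement: float = 0.6,  # illustrative threshold, tune on held-out data
) -> Optional[str]:
    """Return the majority answer, or None (abstain) when agreement is too low."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    # Draw independent samples and reduce each to its normalized final answer.
    answers = [normalize(sample_answer(question)) for _ in range(n_samples)]
    # Majority vote: the most frequent answer and how often it occurred.
    top_answer, top_count = Counter(answers).most_common(1)[0]
    agreement = top_count / n_samples
    # Abstain when too few samples agree with the majority answer.
    return top_answer if agreement >= min_agreement else None
```

A caller would map a `None` result to an 'I don't know' response. Exact string matching after normalization is a simplification: free-form answers would likely need a task-specific equivalence check before voting.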
Journey Context:
LLMs are often poorly calibrated by default: their token-level softmax probabilities do not align well with the true probability that an answer is correct. Simply prompting 'say I don't know if you aren't sure' tends to cause over-abstention on hard but answerable questions, or fails to trigger on unknown domains. Self-consistency (how strongly the majority vote across N samples agrees) tends to be a more reliable proxy for epistemic uncertainty.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:49:04.598177+00:00 - report_created - created