Report #75337
[research] Trusting the LLM's self-reported confidence ('I am 95% sure')
Do not rely on verbalized confidence scores for decision-making. If calibration is required, use the model's log probabilities (logprobs) or an external calibration model. If using verbalized uncertainty, force the model to generate reasoning for its uncertainty *before* outputting the score.
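If verbalized uncertainty has to be used, the "reasoning before score" ordering can be enforced in the prompt and checked when parsing the reply. The following is a minimal sketch, not a tested recipe: the prompt wording, the field names (`ANSWER`, `UNCERTAINTY_REASONING`, `CONFIDENCE`), and the helper `parse_confidence` are all hypothetical and not tied to any particular model or API.

```python
import re

# Hypothetical prompt template: the reasoning field is requested BEFORE the score,
# so the score is generated conditioned on the stated reasons for doubt.
PROMPT_TEMPLATE = (
    "Question: {question}\n"
    "Respond in exactly this format:\n"
    "ANSWER: <your answer>\n"
    "UNCERTAINTY_REASONING: <what could make this answer wrong>\n"
    "CONFIDENCE: <integer from 0 to 100>\n"
)

_REASON_RE = re.compile(r"^UNCERTAINTY_REASONING:\s*(.+)$", re.MULTILINE)
_CONF_RE = re.compile(r"^CONFIDENCE:\s*(\d{1,3})\s*%?\s*$", re.MULTILINE)


def parse_confidence(response: str):
    """Return the verbalized confidence in [0, 1], or None if the reply
    skipped the reasoning, put the score first, or gave an invalid score."""
    reason = _REASON_RE.search(response)
    conf = _CONF_RE.search(response)
    if reason is None or conf is None:
        return None
    # Reject replies where the score appears before the reasoning.
    if conf.start() < reason.start():
        return None
    value = int(conf.group(1))
    if value > 100:
        return None
    return value / 100.0
```

A reply that returns `None` here is better treated as "no usable confidence" than retried until a number appears. Even a well-formed score remains a verbalized estimate and should not be used as a calibrated probability.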
Journey Context:
LLMs are poorly calibrated when asked to state their confidence in natural language. A model that says 'I am 90% confident' might be correct only 40% of the time. Verbalized confidence often reflects how frequently a concept appears in the training data rather than genuine epistemic uncertainty. Logprobs, while still imperfect, correlate much better with actual correctness and give a more principled basis for setting thresholds.
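The gap between stated and actual accuracy can be measured, and a logprob-based score can be computed, with a few lines of Python. This is a sketch under assumptions: it presumes the serving API returns one natural-log probability per generated answer token (the exact response field varies by provider and is not shown), and the function names `sequence_confidence` and `expected_calibration_error` are illustrative, not from any library.

```python
import math


def sequence_confidence(token_logprobs):
    """Joint probability of the answer tokens: exp(sum of natural-log probs).

    Assumes `token_logprobs` holds one natural-log probability per answer token.
    """
    if not token_logprobs:
        raise ValueError("need at least one token logprob")
    return math.exp(sum(token_logprobs))


def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected calibration error over equal-width confidence bins.

    `confidences` are scores in [0, 1]; `correct` are matching booleans.
    Returns the size-weighted mean of |accuracy - mean confidence| per bin.
    """
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and the same length")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece
```

For the scenario described above (ten answers each stated at 0.9, of which four are correct), `expected_calibration_error` comes out at roughly 0.5, that is, a 50-point gap. Running the same check on logprob-derived scores from a labeled validation set is one way to decide whether a given threshold is trustworthy. Note that the joint probability shrinks as answers get longer, so thresholds chosen this way are only comparable across answers of similar length.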
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:03:27.549446+00:00— report_created — created