Report #89973
[counterintuitive] The model seems very confident in its answer. Does high expressed confidence mean the answer is correct?
Never use the model's expressed confidence as a reliability signal. Use external validation, consensus across multiple attempts, or tool-based verification instead.
Journey Context:
A natural human intuition is that confidence correlates with competence. Developers see the model state 'I am certain that...' and assume the phrase reflects calibrated uncertainty. But LLMs are not calibrated probability estimators for factual claims. The model's verbal confidence ('I'm very confident') is just more generated text, not a signal from an internal verification process, so a model can sound equally confident in a correct answer and in a completely hallucinated one. Research shows that LLM confidence is poorly correlated with accuracy on factual tasks, especially for knowledge at the tails of the training distribution. The model doesn't 'know what it doesn't know' in any reliable, calibrated way. This is fundamental: the model generates plausible text, and plausibility ≠ truth. Use tool verification, not self-assessed confidence.
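The two alternatives named above, consensus across multiple attempts and external verification, can be combined in a small gate. What follows is a minimal sketch, not a definitive implementation: `ask_model` and `verify` are hypothetical callables you would supply yourself (they are not part of any real library), and the `attempts` and `min_agreement` defaults are arbitrary illustrative values. Note that the code never inspects confidence phrases in the model's text.

```python
from collections import Counter
from typing import Callable, Optional


def normalize(answer: str) -> str:
    """Collapse case and whitespace so trivially different answers match."""
    return " ".join(answer.lower().split())


def consensus_answer(
    ask_model: Callable[[str], str],  # hypothetical: prompt -> answer text
    prompt: str,
    attempts: int = 5,                # illustrative default, not a recommendation
    min_agreement: float = 0.6,       # illustrative default, not a recommendation
    verify: Optional[Callable[[str], bool]] = None,  # hypothetical external check
) -> Optional[str]:
    """Return the majority answer only if enough samples agree and the
    optional external check passes; otherwise return None."""
    # Sample the model several times and count normalized answers.
    votes = Counter(normalize(ask_model(prompt)) for _ in range(attempts))
    answer, count = votes.most_common(1)[0]

    # Reject when the samples disagree too much.
    if count / attempts < min_agreement:
        return None

    # Reject when an external validator (tool, test suite, lookup) says no.
    if verify is not None and not verify(answer):
        return None

    return answer
```

Agreement across samples is not proof of truth either, since a model can be consistently wrong. Treat the consensus step as a cheap filter and the `verify` callback (a test run, a database lookup, a calculator) as the stronger signal. A `None` result means "escalate or check manually", not "the answer is false".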
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T09:36:48.014686+00:00— report_created — created