Report #93833
[research] Trusting the model's verbalized confidence (e.g., 'I am 90% sure') as a true measure of its factual certainty
Do not rely on verbalized confidence to guard against hallucination. Verify facts with external tools (e.g., web search) or with logit-based token probabilities instead: verbalized confidence is poorly calibrated and often reflects tone rather than epistemic state.
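A minimal sketch of the logit-based check, assuming the serving API can return per-token log-probabilities for the generated answer (many can, but the exact field name varies). The helper names, the length-normalized score, and the 0.8 threshold are illustrative assumptions, not part of the original report. Token probabilities are themselves only a signal, so a low score should trigger external verification rather than replace it.

```python
import math
import re
from typing import Optional, Sequence


def sequence_confidence(token_logprobs: Sequence[float]) -> float:
    """Length-normalized confidence: geometric mean of the token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def parse_verbalized_confidence(text: str) -> Optional[float]:
    """Extract a stated percentage such as 'I am 90% sure' as a value in [0, 1]."""
    match = re.search(r"(\d{1,3}(?:\.\d+)?)\s*%", text)
    if match is None:
        return None
    return min(float(match.group(1)), 100.0) / 100.0


def needs_verification(token_logprobs: Sequence[float],
                       threshold: float = 0.8) -> bool:
    """Flag an answer for external checking (e.g., web search).

    The decision uses token probabilities only; whatever confidence the
    model states in its text is deliberately ignored.
    The threshold is a hypothetical placeholder to tune on labeled data.
    """
    return sequence_confidence(token_logprobs) < threshold
```

Logging `parse_verbalized_confidence(answer)` next to `sequence_confidence(logprobs)` for the same answers is one way to see how far the stated figure drifts from the token-level score.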
Journey Context:
LLMs are trained to sound confident. When asked to express uncertainty, they often mimic the language of uncertainty without being calibrated: a model might say 'I am highly confident' about a complete hallucination. Verbalized confidence is generated text like any other output, not a statistical measure read from the model's output probabilities.
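To make "poorly calibrated" measurable, one common metric is expected calibration error (ECE): group answers by stated confidence and compare each group's average confidence with its actual accuracy. The sketch below assumes you already have a labeled evaluation set of (stated confidence, was the answer correct) pairs; the function name and the bin count of 10 are illustrative choices.

```python
from typing import Sequence


def expected_calibration_error(confidences: Sequence[float],
                               correct: Sequence[bool],
                               n_bins: int = 10) -> float:
    """Weighted average gap between stated confidence and observed accuracy."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one example")

    # Equal-width bins over [0, 1]; a confidence of exactly 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(conf for conf, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece
```

A model that says "90% sure" on four answers and gets only one right has a large gap between stated confidence and accuracy, which is the failure described above. A value near zero would indicate the stated confidences track accuracy on that evaluation set.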
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:05:11.777873+00:00— report_created — created