Report #9222
[research] LLM claims high confidence ('I am 90% sure') on answers that are factually wrong
Do not rely on the LLM's self-reported numerical confidence. Instead, estimate confidence from generation probabilities (logprobs) or from agreement across multiple samples (self-consistency). If you do use verbalized uncertainty, have the model output a structured confidence score *after* it generates its reasoning, not before.
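As a rough illustration of the logprob route, the sketch below turns per-token log-probabilities into a few summary scores. It assumes you have already obtained a list of token logprobs for the answer span from whatever API or runtime you use; the function name `sequence_confidence` and the returned keys are hypothetical, not part of any library.

```python
import math
from typing import Dict, Sequence


def sequence_confidence(token_logprobs: Sequence[float]) -> Dict[str, float]:
    """Summarize per-token natural-log probabilities of a generated answer.

    Assumption: token_logprobs covers only the answer tokens, not the prompt.
    """
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    total = sum(token_logprobs)
    return {
        # Probability of the whole token sequence; shrinks with length.
        "joint_prob": math.exp(total),
        # Geometric mean of token probabilities; length-normalized.
        "mean_token_prob": math.exp(total / len(token_logprobs)),
        # Probability of the least likely token; flags a single weak step.
        "min_token_prob": math.exp(min(token_logprobs)),
    }


# Illustrative values only, not taken from a real model.
scores = sequence_confidence([-0.05, -0.10, -2.30])
print(scores)
```

These scores describe how likely the model found its own wording, which is not the same as factual correctness, so they are best treated as a signal to calibrate against labeled data rather than a probability of being right.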
Journey Context:
LLMs are often poorly calibrated: their verbalized confidence correlates only weakly with actual accuracy. Models tend to mimic the way humans express confidence rather than report a statistical estimate. Logprob-based calibration or self-consistency (sampling N times and taking the majority vote) gives a confidence signal grounded in the model's output distribution, whereas verbalized confidence is just more generated text and is itself prone to hallucination.
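The self-consistency idea described above can be sketched in a few lines. This is a minimal example, assuming a caller-supplied `sample_answer` function (hypothetical) that queries the model once with nonzero temperature and returns its final answer as a string; the normalization step (strip and lowercase) is also an assumption and would need to match your answer format.

```python
from collections import Counter
from typing import Callable, Tuple


def self_consistency(sample_answer: Callable[[], str], n: int = 10) -> Tuple[str, float]:
    """Sample n answers and return (majority answer, fraction of samples agreeing)."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Normalize so trivial formatting differences do not split the vote.
    answers = [sample_answer().strip().lower() for _ in range(n)]
    top_answer, votes = Counter(answers).most_common(1)[0]
    return top_answer, votes / n


# Stand-in for a real model call: replays canned answers in order.
canned = iter(["Paris", "paris ", "Lyon", "Paris"])
answer, agreement = self_consistency(lambda: next(canned), n=4)
print(answer, agreement)
```

A low agreement fraction suggests the model is unsure even if each individual sample sounds confident. High agreement does not guarantee correctness, since a model can be consistently wrong.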
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T07:39:53.155706+00:00 - report_created - created