Report #68134
[research] LLM expresses high verbal confidence for answers that are factually wrong, or says 'I think' for answers it assigns high probability to
Do not rely on the LLM's self-reported confidence. Use token probabilities (logprobs) to gauge certainty. If logprobs are unavailable, use self-consistency: sample N times with temperature > 0 and flag the answer as uncertain if the samples disagree widely (high variance).
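The self-consistency fallback can be sketched roughly as follows. This is a minimal illustration, not a tested integration: `sample` is a hypothetical zero-argument callable that wraps whatever model call you use (with temperature > 0) and returns one answer string, and the 0.7 agreement threshold is an arbitrary assumption that would need tuning per task.

```python
from collections import Counter
from typing import Callable, Tuple


def self_consistency(
    sample: Callable[[], str],   # hypothetical wrapper around one model call
    n: int = 10,
    min_agreement: float = 0.7,  # assumed threshold, tune per task
) -> Tuple[str, float, bool]:
    """Sample n answers; return (majority answer, agreement, is_uncertain)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Normalize lightly so trivial formatting differences do not count
    # as disagreement between samples.
    answers = [sample().strip().lower() for _ in range(n)]
    top_answer, top_count = Counter(answers).most_common(1)[0]
    agreement = top_count / n
    # Low agreement across samples is treated as a sign of uncertainty.
    return top_answer, agreement, agreement < min_agreement
```

Exact string matching only suits short, closed-form answers. For free-form text, the comparison step would need something looser (for example, extracting the final answer first), which is outside this sketch.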
Journey Context:
LLMs are trained to sound helpful and authoritative, so their verbalized uncertainty is poorly calibrated to their actual epistemic uncertainty: a model will state a hallucination with full confidence. Token logprobs and self-consistency sampling measure the model's output distribution directly, which tends to correlate much better with factual accuracy than the text the model generates about its own confidence.
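When logprobs are available, one way to turn them into a confidence signal is to aggregate the per-token log probabilities of the answer span. The sketch below assumes you have already extracted those values as a list of floats from your provider's response (the extraction step is API-specific and not shown). The function name and the choice of summary statistics are illustrative, not a standard.

```python
import math
from typing import Dict, Sequence


def sequence_confidence(token_logprobs: Sequence[float]) -> Dict[str, float]:
    """Summarize per-token natural-log probabilities for one answer span."""
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    total = sum(token_logprobs)
    return {
        # Probability of the whole span; shrinks as the span gets longer.
        "joint_prob": math.exp(total),
        # Geometric mean of token probabilities; length-normalized.
        "mean_token_prob": math.exp(total / len(token_logprobs)),
        # Weakest single token; often the one carrying the uncertain fact.
        "min_token_prob": math.exp(min(token_logprobs)),
    }
```

These numbers describe how likely the model found its own tokens, not whether the answer is true, so any threshold on them should be validated against labeled examples before being trusted.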
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:50:34.045452+00:00— report_created — created