Report #51523
[research] Asking the LLM 'How confident are you?' and trusting the verbalized percentage
Estimate confidence from token probabilities (logprobs) or from self-consistency sampling (generate N times and measure how often the answers agree), rather than asking the model to verbalize its certainty.
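A minimal sketch of the self-consistency idea, under stated assumptions: `sample_fn` is a hypothetical zero-argument callable that returns one sampled answer string per call (for example, a wrapper around a model request with temperature above 0). The function name `self_consistency` and the strip/lowercase normalization are illustrative choices, not part of the original report.

```python
from collections import Counter
from typing import Callable, Tuple


def self_consistency(sample_fn: Callable[[], str], n: int = 10) -> Tuple[str, float]:
    """Sample n answers and return (majority answer, fraction of samples that agree)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Light normalization so trivially different strings count as the same answer.
    # Real answers (free text, numbers, code) usually need task-specific normalization.
    answers = [sample_fn().strip().lower() for _ in range(n)]
    top_answer, top_count = Counter(answers).most_common(1)[0]
    return top_answer, top_count / n


# Usage with a stub sampler standing in for a real model call (assumption).
canned = iter(["Paris", "paris ", "Lyon", "Paris"])
answer, agreement = self_consistency(lambda: next(canned), n=4)
print(answer, agreement)  # paris 0.75
```

The agreement fraction is only a proxy: a model can be consistently wrong, so a high value is best read as "stable answer", not "correct answer".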
Journey Context:
Verbalized confidence from LLMs is often poorly calibrated: models tend to state high confidence even when they are wrong. A statement such as 'I am 95% sure' correlates poorly with actual accuracy because the model is generating plausible-sounding uncertainty tokens, not reporting a measured probability. Self-consistency (majority vote across multiple generations) and the entropy of the top-k logprobs are instead computed from the model's actual output distribution, so they give a more direct, quantitative signal of its uncertainty.
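The entropy signal can be sketched as follows, assuming the top-k log-probabilities for a single token position have already been retrieved from whatever API is in use (how to request them varies by provider and is not shown). The names `topk_entropy` and `normalized_confidence` are hypothetical helpers, and renormalizing over only the top-k candidates is a simplification that ignores the probability mass of the remaining vocabulary.

```python
import math
from typing import Sequence


def topk_entropy(logprobs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of the top-k candidates, renormalized to sum to 1."""
    if not logprobs:
        raise ValueError("need at least one log-probability")
    # Subtract the max before exponentiating for numerical stability.
    m = max(logprobs)
    weights = [math.exp(lp - m) for lp in logprobs]
    total = sum(weights)
    probs = [w / total for w in weights]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def normalized_confidence(logprobs: Sequence[float]) -> float:
    """Return 1 - H / log(k): near 1 when one candidate dominates, 0 when uniform."""
    h = topk_entropy(logprobs)  # raises on empty input
    k = len(logprobs)
    if k < 2:
        return 1.0
    return 1.0 - h / math.log(k)


# Usage with made-up log-probabilities (illustrative values, not real model output).
peaked = [math.log(0.97), math.log(0.01), math.log(0.01), math.log(0.01)]
uniform = [math.log(0.25)] * 4
print(normalized_confidence(peaked) > normalized_confidence(uniform))  # True
```

This measures uncertainty over the next token only. For multi-token answers, one option is to aggregate per-token values (for example, the mean or the minimum), and that choice should be validated against labeled data before being trusted.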
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:58:20.595158+00:00— report_created — created