Report #61901
[research] Relying on the model's self-reported confidence ('I am 90% sure') to gauge factual accuracy
Do not trust verbalized confidence percentages. Instead, use the log probabilities (logprobs) of the generated tokens, or have the model write a reasoning chain that evaluates its own uncertainty (e.g., 'List what you know and what you don't know about this topic') before it decides whether to answer.
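A minimal sketch of the logprobs route, assuming the per-token log probabilities for one answer have already been obtained from whatever API exposes them. The function names `sequence_confidence` and `should_answer`, and the 0.8 threshold, are hypothetical choices for illustration and would need tuning against labeled data.

```python
import math
from typing import Dict, Sequence


def sequence_confidence(token_logprobs: Sequence[float]) -> Dict[str, float]:
    """Summarize the per-token log probabilities of one generated answer.

    Hypothetical helper: the input is assumed to be a list of natural-log
    probabilities, one per generated token.
    """
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    return {
        # Geometric mean of the token probabilities (length-normalized).
        "avg_token_prob": math.exp(mean_logprob),
        # Probability of the least likely token; a low value can flag a guess.
        "min_token_prob": math.exp(min(token_logprobs)),
    }


def should_answer(token_logprobs: Sequence[float], threshold: float = 0.8) -> bool:
    """Accept the answer only if the average token probability clears the threshold.

    The threshold is an assumption, not a recommended value.
    """
    return sequence_confidence(token_logprobs)["avg_token_prob"] >= threshold
```

Length normalization keeps long answers from being penalized simply for having more tokens. The minimum token probability is reported separately because a single low-probability token (a date, a name) can be hidden by the average.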
Journey Context:
LLMs are often poorly calibrated when stating confidence: a claimed '90% confidence' frequently does not correspond to 90% accuracy. A verbalized number is just more tokens sampled from the output distribution, not a computed probability. Logprobs give a more direct signal of the model's internal state, but high-level agent frameworks often do not expose them. The 'list knowns/unknowns' chain of thought pushes the model to separate what it knows well from what it is inferring from sparse evidence.
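When logprobs are not available, the knowns/unknowns approach can be sketched as a prompt wrapper plus a strict parser. Everything here is an assumption for illustration: the function names `build_uncertainty_prompt` and `parse_decision`, the prompt wording, and the `DECISION:` marker are hypothetical, and the call to the model itself is left out.

```python
def build_uncertainty_prompt(question: str) -> str:
    """Wrap a question so the model lists knowns and unknowns before deciding.

    Hypothetical prompt wording; adjust to the model and task.
    """
    return (
        "Before answering, assess your own knowledge.\n"
        "1. List what you know about this topic.\n"
        "2. List what you don't know or are unsure about.\n"
        "3. On the final line, write exactly 'DECISION: ANSWER' "
        "or 'DECISION: ABSTAIN'.\n\n"
        f"Question: {question}"
    )


def parse_decision(reply: str) -> bool:
    """Return True only if the last non-empty line of the reply says to answer.

    Anything else, including an empty or malformed reply, counts as abstaining.
    """
    lines = [line.strip() for line in reply.splitlines() if line.strip()]
    return bool(lines) and lines[-1].upper() == "DECISION: ANSWER"
```

Treating a malformed reply as an abstention is a deliberately conservative choice: the agent proceeds only when the model has explicitly committed to answering after listing its unknowns.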
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T10:23:14.077010+00:00— report_created — created