Report #6050
[research] Asking the LLM to express confidence as a percentage yields poorly calibrated numbers
Use token probabilities (logprobs) from the API where available for calibration, or force a strict categorical uncertainty scale (e.g., 'Certain', 'Likely', 'Unsure') with explicit definitions. If using verbalized confidence, prompt the model to justify its uncertainty *before* assigning a number.
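A minimal Python sketch of the three options above, under stated assumptions. It assumes the per-token logprobs of an answer have already been obtained as a list of natural-log floats; no specific API client is shown. The `Confidence` definitions, the prompt wording, and the helper names `sequence_confidence`, `build_prompt`, and `parse_label` are hypothetical illustrations, not part of any library.

```python
import math
from enum import Enum


class Confidence(Enum):
    """Hypothetical three-level scale; the definitions are illustrative only."""
    CERTAIN = "Certain: directly supported by the given context or a well-established fact."
    LIKELY = "Likely: consistent with what is known, but not directly verified."
    UNSURE = "Unsure: a guess, or the evidence is missing or conflicting."


def sequence_confidence(token_logprobs):
    """Return the geometric-mean token probability of a generated answer.

    token_logprobs: natural-log probabilities, one per generated token
    (assumed to come from an API that exposes logprobs).
    """
    if not token_logprobs:
        raise ValueError("need at least one token logprob")
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_logprob)


def build_prompt(question):
    """Ask for the justification first, then a label from the fixed scale."""
    scale = "\n".join(f"- {level.value}" for level in Confidence)
    return (
        f"Question: {question}\n\n"
        "First answer the question. Then list the reasons your answer could be wrong.\n"
        "Only after that, choose exactly one label from this scale:\n"
        f"{scale}\n"
        "End with a line of the form 'Confidence: <label>'."
    )


def parse_label(response_text):
    """Extract the label from the last 'Confidence: <label>' line, or None."""
    for line in reversed(response_text.strip().splitlines()):
        if line.lower().startswith("confidence:"):
            name = line.split(":", 1)[1].strip().upper()
            return Confidence[name] if name in Confidence.__members__ else None
    return None
```

The geometric mean is used here so that long answers are not penalized purely for length; it is one reasonable aggregation among several, and the resulting score should still be checked against observed accuracy before being trusted.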
Journey Context:
LLMs exhibit an 'illusion of competence': they will frequently output '95% confident' on completely fabricated answers. Verbalized confidence correlates poorly with actual accuracy because the model reproduces linguistic patterns of confidence rather than reporting epistemic certainty. Logprobs read the model's output distribution directly instead of asking the model to describe it, while forcing justification before the rating mitigates the anchoring effect of an immediate high-confidence guess.
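One way to check how poorly a confidence signal tracks accuracy is expected calibration error (ECE), sketched below in Python. This is a standard metric swapped in for illustration, not something the report prescribes; the function name and the choice of ten equal-width bins are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Compute ECE using equal-width confidence bins over [0, 1].

    confidences: floats in [0, 1], one per answer (verbalized or logprob-based).
    correct: booleans of the same length, True when the answer was right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one sample")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if ok else 0.0))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(a for _, a in bucket) / len(bucket)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece
```

As a made-up illustration, four answers each rated 0.95 of which only one is correct give an ECE of 0.70, while answers rated 1.0 that are all correct give 0. Running the same check on verbalized percentages and on logprob-derived scores for one evaluation set shows which signal is closer to observed accuracy.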
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T23:06:08.242858+00:00— report_created — created