Report #76759
[research] Relying on LLM verbalized confidence to gauge factual accuracy
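Extract token logprobs from the model API and compute the negative log-likelihood (NLL) of the generated answer: the negated sum of its token logprobs, ideally divided by the token count so that answers of different lengths are comparable. Use logprob-based metrics as the primary uncertainty signal, and treat verbalized confidence as a secondary, poorly calibrated heuristic.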
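A minimal sketch of the NLL calculation, assuming the per-token logprobs (natural log) of the generated answer have already been pulled out of the API response into a plain list. The function name `answer_uncertainty` and the returned keys are hypothetical, and how the logprobs are obtained depends on the provider.

```python
import math
from typing import Dict, Sequence


def answer_uncertainty(token_logprobs: Sequence[float]) -> Dict[str, float]:
    """Summarize uncertainty from the logprobs of the generated answer tokens.

    token_logprobs: natural-log probabilities, one per generated token.
    (Hypothetical helper; extraction from the API response is not shown.)
    """
    if not token_logprobs:
        raise ValueError("need at least one token logprob")

    # Negative log-likelihood of the whole answer: higher means less certain.
    nll = -sum(token_logprobs)

    # Length-normalized NLL, so long and short answers can be compared.
    mean_nll = nll / len(token_logprobs)

    return {
        "nll": nll,
        "mean_nll": mean_nll,
        # Perplexity is exp of the mean NLL; 1.0 means fully certain.
        "perplexity": math.exp(mean_nll),
        # Joint probability the model assigned to the full answer.
        "sequence_prob": math.exp(-nll),
    }


if __name__ == "__main__":
    # Two tokens, each generated with probability 0.5.
    print(answer_uncertainty([math.log(0.5), math.log(0.5)]))
```

Any threshold on `mean_nll` for flagging an answer as uncertain is task-specific and should be chosen on labeled examples rather than assumed.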
Journey Context:
LLMs are poorly calibrated when asked to state their confidence verbally; they often claim high confidence in completely fabricated facts. Logprobs correlate much better with factual accuracy because they come directly from the model's predicted probability distribution over tokens, rather than from text the model generates about itself. However, some APIs, particularly those for closed-source models, do not expose logprobs, which forces reliance on verbalized confidence. In that case the verbalized scores require aggressive calibration via few-shot examples.
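One way to check the calibration claim on your own data is expected calibration error (ECE), a standard metric that the report itself does not name. The sketch below assumes you already have confidence scores in [0, 1] (verbalized, or derived from logprobs) and a correctness label for each answer; the function name and the default of 10 bins are illustrative choices.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """ECE with equal-width bins: weighted mean of |confidence - accuracy|."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one sample")
    if any(c < 0.0 or c > 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")

    # Group samples by confidence; a confidence of 1.0 goes in the last bin.
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's gap by the fraction of samples it holds.
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Model says "90% sure" four times but is right only once.
    print(expected_calibration_error([0.9] * 4, [True, False, False, False]))
```

Computing this separately for verbalized confidence and for a logprob-derived score on the same labeled set gives a direct comparison of the two signals; a lower ECE indicates better calibration.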
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:26:01.678018+00:00— report_created — created