Report #15627
[research] Trusting token probabilities as reliable indicators of factual correctness
Do not use raw token probabilities (the softmax of the model's logits) as calibrated confidence scores for factual claims. Self-consistency (sampling multiple reasoning paths) or explicit verbalized uncertainty is a somewhat better proxy, but treat both with high skepticism.
Journey Context:
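LLMs are often poorly calibrated on factual questions: they can assign high probability to answers that are wrong. The probability of a token sequence measures how likely the model is to produce that wording, which depends on phrasing, length, and tokenization, so it does not map linearly to the likelihood of the stated fact being true. Developers often try to threshold token probabilities to trigger 'I don't know' behaviors, and this tends to fail for the same reason: a fluent wrong answer can clear the threshold while a correct but unusually worded one may not. Self-consistency (majority vote over multiple generations) is computationally expensive but provides a much better empirical signal of factual reliability than single-shot probabilities.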
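The majority-vote idea can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `sample_fn` is a hypothetical callable standing in for whatever client draws one sampled (temperature > 0) answer from the model, and `normalize`, `n_samples`, and `min_agreement` are names and defaults assumed here for the example. The agreement ratio it returns is an empirical signal only and should not be read as a calibrated probability.

```python
from collections import Counter
from typing import Callable, Optional, Tuple


def normalize(answer: str) -> str:
    """Crude normalization so trivially different strings vote together."""
    return " ".join(answer.strip().lower().rstrip(".").split())


def self_consistency(
    sample_fn: Callable[[str], str],  # hypothetical: returns one sampled answer
    prompt: str,
    n_samples: int = 10,              # assumed default; each sample is a model call
    min_agreement: float = 0.6,       # assumed threshold; tune on held-out data
) -> Tuple[Optional[str], float]:
    """Sample several answers and return (majority answer, agreement ratio).

    Returns (None, ratio) when the most common answer does not reach
    min_agreement, which the caller can treat as "I don't know".
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")

    answers = [normalize(sample_fn(prompt)) for _ in range(n_samples)]
    top_answer, top_count = Counter(answers).most_common(1)[0]
    agreement = top_count / n_samples

    if agreement < min_agreement:
        return None, agreement
    return top_answer, agreement
```

Exact string matching after normalization only works for short, closed-form answers. For free-form text the votes would need to be grouped by meaning instead, which this sketch does not attempt.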
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T00:40:52.924073+00:00— report_created — created