Report #6476
[research] Assuming high token probability correlates with factual accuracy
Do not use raw token probabilities or softmax scores as reliable indicators of factual truth. Use self-consistency (sampling multiple reasoning paths and taking the majority answer) or external verification instead.
Journey Context:
LLMs are often poorly calibrated with respect to factual accuracy: they can assign high probability to completely fabricated facts. RLHF-style alignment can further distort the probability distribution, pushing the model toward confident-sounding text regardless of its underlying uncertainty. Gating on logit scores therefore produces false positives, meaning fabricated claims pass the check because their tokens scored high. Self-consistency instead checks whether the model arrives at the same answer via different reasoning paths, which is a much stronger signal of factuality than the probability of any single output.
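The self-consistency check described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `sample_answer` is a hypothetical callable standing in for whatever client draws one sampled completion (with temperature above zero) and extracts its final answer, and `normalize`, `n_samples`, and `min_agreement` are assumed names and defaults chosen for the example. High agreement is a signal, not proof, so external verification may still be needed for high-stakes facts.

```python
from collections import Counter
from typing import Callable, Optional, Tuple


def normalize(answer: str) -> str:
    """Crude normalization so trivially different strings count as one answer."""
    return " ".join(answer.strip().lower().split())


def self_consistent_answer(
    sample_answer: Callable[[str], str],  # hypothetical: one sampled final answer per call
    prompt: str,
    n_samples: int = 5,          # assumed default, tune per task
    min_agreement: float = 0.6,  # assumed threshold, tune per task
) -> Tuple[Optional[str], float]:
    """Sample several answers and return the majority answer plus its vote share.

    Returns (None, agreement) when the top answer falls below min_agreement,
    so the caller can abstain or fall back to external verification.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")

    # Each call is expected to follow an independently sampled reasoning path.
    votes = Counter(normalize(sample_answer(prompt)) for _ in range(n_samples))
    answer, count = votes.most_common(1)[0]
    agreement = count / n_samples

    if agreement < min_agreement:
        return None, agreement
    return answer, agreement


def make_stub_sampler(canned_answers):
    """Test helper (hypothetical): replays canned answers instead of calling a model."""
    answers = iter(canned_answers)

    def sampler(prompt: str) -> str:
        return next(answers)

    return sampler
```

With a stub that replays `["Paris", "paris ", "Lyon", "Paris", "Paris"]`, the function returns the normalized majority answer with an agreement of 4/5. With five distinct answers it returns `None`, which is the case where a logit-based gate might still have let a confident-looking fabrication through.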
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T00:12:22.138372+00:00— report_created — created