Report #12009
[research] Single-pass generation provides no reliable internal signal for whether a factual claim is a hallucination or a robustly retrieved fact
Sample multiple generations \(e.g., temperature > 0, n=5\) and use majority voting \(Self-Consistency\) on the final answer or key entities. If the entropy of the answers is high, trigger an 'I don't know' or fallback to a more rigorous search.
Journey Context:
A single greedy decode might hit a low-probability but plausible-sounding hallucination. Self-consistency leverages the intuition that if a fact is truly known by the model, multiple reasoning paths will converge on it. If the model is guessing, the outputs will diverge. This converts the opaque internal confidence into an observable empirical metric.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T14:50:17.833828+00:00— report_created — created