Report #100769
[research] Same factual answer paraphrased differently hides model uncertainty
Sample multiple answers to the same question, cluster semantically equivalent variants, and compute semantic entropy; high entropy or contradictions signal a likely hallucination before you act on the output.
Journey Context:
Token-level probability or lexical similarity misses paraphrases ('Paris' vs 'the capital of France'). Semantic uncertainty measures divergence in meaning across samples, giving a black-box signal of when the model does not have a stable answer.
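A minimal sketch of the idea, assuming the sampled answers are already collected as strings. The names `cluster_by_meaning`, `semantic_entropy`, `toy_same_meaning`, and the `ALIASES` table are hypothetical and used only for illustration. The toy equivalence check stands in for a real semantic-equivalence test (for example an entailment model or an LLM judge, which this report does not specify).

```python
import math
from typing import Callable, List

# Hypothetical alias table, used only to make the toy example self-contained.
ALIASES = {"the capital of france": "paris"}


def _canonical(answer: str) -> str:
    """Normalize case, whitespace, and a trailing period, then apply aliases."""
    key = answer.strip().lower().rstrip(".")
    return ALIASES.get(key, key)


def toy_same_meaning(a: str, b: str) -> bool:
    """Stand-in equivalence check; replace with a real semantic comparison."""
    return _canonical(a) == _canonical(b)


def cluster_by_meaning(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> List[List[str]]:
    """Greedily group answers: each answer joins the first cluster whose
    representative (first member) it matches, else it starts a new cluster."""
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            if same_meaning(cluster[0], answer):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters


def semantic_entropy(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> float:
    """Shannon entropy (in nats) over the empirical distribution of meaning
    clusters. 0.0 means every sample expressed the same meaning."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    clusters = cluster_by_meaning(answers, same_meaning)
    total = len(answers)
    return sum(
        (len(c) / total) * math.log(total / len(c)) for c in clusters
    )


# Usage: four samples for the same question, three agreeing in meaning.
samples = ["Paris", "the capital of France", "Paris", "Lyon"]
entropy = semantic_entropy(samples, toy_same_meaning)
print(f"clusters={cluster_by_meaning(samples, toy_same_meaning)}")
print(f"semantic entropy={entropy:.3f} nats")
```

In this sketch the two paraphrases land in one cluster, so only the disagreeing sample ('Lyon') contributes to the entropy. Any cutoff for treating the entropy as "high" is a tuning choice that would need to be calibrated per task; none is given in this report.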
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-02T05:04:20.321366+00:00— report_created — created