Report #96997
[research] Stating 'I am certain' or using definitive language for facts that are probabilistic or low-confidence
Instruct the model to verbalize its confidence level or to use epistemic markers (e.g., 'It is highly likely', 'Based on available data'). Better yet, derive confidence from token-level log-probabilities where they are available, or ask the model to generate its own uncertainty bounds.
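A minimal sketch of the log-probability route, assuming the serving API returns one log-probability per generated token. The names `sequence_confidence`, `hedge`, and `MARKERS`, and the 0.90 / 0.60 thresholds, are hypothetical and chosen only for illustration; the score is a length-normalized (geometric-mean) token probability, which is one common heuristic and not a guaranteed measure of factual correctness.

```python
import math

# Hypothetical thresholds and phrasings; tune them on held-out data.
MARKERS = [
    (0.90, "It is highly likely that"),
    (0.60, "Based on available data, it appears that"),
    (0.00, "I am not sure, but it is possible that"),
]


def sequence_confidence(token_logprobs):
    """Return the geometric-mean token probability of a generated answer."""
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    # exp(mean(logprob)) normalizes for answer length.
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def hedge(claim, confidence):
    """Prefix a claim with the marker of the first threshold it meets."""
    for threshold, marker in MARKERS:
        if confidence >= threshold:
            return f"{marker} {claim}"
    raise ValueError("confidence must be in [0, 1]")


if __name__ == "__main__":
    # Assumed example values: three tokens, each with probability 0.5.
    logprobs = [math.log(0.5)] * 3
    print(hedge("the service was restarted.", sequence_confidence(logprobs)))
```

The marker table keeps the wording separate from the scoring, so the thresholds can be adjusted after checking them against measured accuracy.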
Journey Context:
LLMs are often poorly calibrated by default: their verbalized confidence frequently does not match their actual accuracy. RLHF rewards answers that sound helpful and confident, which can exacerbate hallucination. Teaching models to say 'I don't know' or to express calibrated uncertainty can reduce the rate of confidently stated factual errors.
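One way to check the gap between stated confidence and accuracy is expected calibration error (ECE). The sketch below assumes you already have, for each answer, a confidence in [0, 1] and a graded correct/incorrect label; the function name and the equal-width binning with 10 bins are illustrative choices, not something this report prescribes.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean of |average confidence - accuracy| over confidence bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one sample is required")

    # Equal-width bins over [0, 1]; a confidence of 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, 1.0 if ok else 0.0))

    total = len(confidences)
    ece = 0.0
    for items in bins:
        if not items:
            continue
        avg_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(a for _, a in items) / len(items)
        ece += (len(items) / total) * abs(avg_conf - accuracy)
    return ece
```

A value near 0 means stated confidence tracks accuracy on that sample; a large value with high average confidence is the overconfident pattern described above.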
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T21:23:40.422510+00:00 — report_created — created