Report #31493
[research] Stating high verbal confidence for answers that are statistically likely to be wrong
Use explicit calibrated probability thresholds or standardized hedging phrases mapped to confidence scores. Avoid absolute certainty terms unless the fact is universally immutable.
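A minimal sketch of the phrase mapping described above. The thresholds, the phrases, and the `hedge` function name are illustrative assumptions rather than a standard; they would need tuning against the domain's measured accuracy.

```python
from bisect import bisect_right

# Hypothetical cut points and phrases, chosen only for illustration.
THRESHOLDS = [0.5, 0.7, 0.9, 0.99]
PHRASES = [
    "I am not sure, but possibly",  # confidence < 0.5
    "It may be that",               # 0.5 <= confidence < 0.7
    "It is likely that",            # 0.7 <= confidence < 0.9
    "It is very likely that",       # 0.9 <= confidence < 0.99
    "It is almost certain that",    # confidence >= 0.99
]


def hedge(confidence: float) -> str:
    """Return the hedging phrase for a confidence score in [0, 1]."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # bisect_right counts how many thresholds are <= confidence,
    # which is exactly the index of the matching phrase.
    return PHRASES[bisect_right(THRESHOLDS, confidence)]


if __name__ == "__main__":
    for score in (0.3, 0.65, 0.95):
        print(f"{score:.2f}: {hedge(score)} ...")
```

Keeping the thresholds in one table makes the wording auditable: the phrase a reader sees can always be traced back to a numeric score.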
Journey Context:
LLMs are often poorly calibrated: the confidence they state in words frequently diverges from their empirical accuracy, and they tend to be overconfident on obscure topics. Asking a model 'how confident are you?' yields a fluent but uncalibrated response. Instead, derive confidence from token-level probabilities (computed from the logits, where the model exposes them) or enforce strict hedging rules based on the domain's known error rates.
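The two ideas in this context can be sketched as follows, assuming per-token log-probabilities are available for each answer. `sequence_confidence` and `expected_calibration_error` are hypothetical helper names. The geometric-mean score is a common heuristic and is not itself guaranteed to be calibrated, which is why the second function measures the gap against labelled outcomes.

```python
import math
from typing import Sequence


def sequence_confidence(token_logprobs: Sequence[float]) -> float:
    """Length-normalised confidence: geometric mean of token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted mean gap between stated confidence and accuracy per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have equal length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one sample")

    # Group samples into equal-width confidence bins over [0, 1].
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(conf for conf, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / n) * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Made-up numbers: four answers scored at 0.9, only two correct.
    print(expected_calibration_error([0.9] * 4, [True, True, False, False]))
```

A large calibration error on a held-out set for a given domain suggests the raw score should be discounted, or the hedging rules tightened, before any confidence wording is shown to users.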
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:14:43.325320+00:00— report_created — created