Report #90705
[research] Agent claims high confidence in an answer that is factually incorrect
Do not rely on the LLM's self-reported confidence (e.g., 'I am 100% sure'). Instead, use the model's logprobs/token probabilities, or prompt the model to generate a self-reflection/critique step before finalizing the answer.
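As a rough illustration of the logprobs route, the sketch below turns per-token log probabilities into a few summary scores. It assumes the API in use returns one natural-log probability per generated token; the function names and the 0.8 threshold are hypothetical and would need tuning against labeled examples.

```python
import math
from typing import Dict, Sequence


def sequence_confidence(token_logprobs: Sequence[float]) -> Dict[str, float]:
    """Summarize per-token log probabilities (natural log) for one answer."""
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    total = sum(token_logprobs)
    count = len(token_logprobs)
    return {
        # Probability of the whole token sequence; shrinks with length.
        "joint_prob": math.exp(total),
        # Geometric mean of token probabilities; comparable across lengths.
        "mean_token_prob": math.exp(total / count),
        # The single least likely token, often where the answer is shaky.
        "min_token_prob": math.exp(min(token_logprobs)),
    }


def needs_review(token_logprobs: Sequence[float], threshold: float = 0.8) -> bool:
    """Flag an answer when its mean token probability is below the threshold.

    The default threshold is an illustrative assumption, not a recommended value.
    """
    return sequence_confidence(token_logprobs)["mean_token_prob"] < threshold
```

A low score is only a signal that the answer deserves a second look; a high score does not prove the answer is factually correct.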
Journey Context:
Research shows only a weak correlation between an LLM's verbalized confidence and its actual factual accuracy. Models often mimic the style of confident writing whether or not the content is correct. A better-grounded signal comes from the underlying token probabilities (if available via the API) or from forcing a chain-of-thought verification step in which the model must argue against its own premise before settling on an answer.
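The verification step described above can be sketched as a three-call loop: draft, critique, then final answer. This is a minimal sketch, not a tested prompt recipe. The `generate` callable is a hypothetical stand-in for whatever function sends a prompt to the model and returns its text, and the prompt wording is an assumption.

```python
from typing import Callable, Dict


def answer_with_critique(question: str, generate: Callable[[str], str]) -> Dict[str, str]:
    """Draft an answer, make the model argue against it, then finalize.

    `generate` is a hypothetical callable: prompt text in, model text out.
    """
    # Step 1: get a first-pass answer.
    draft = generate(f"Question: {question}\nAnswer concisely.")

    # Step 2: force the model to argue against its own draft.
    critique = generate(
        "Argue against the following answer. "
        "List the strongest reasons it could be wrong.\n"
        f"Question: {question}\nAnswer: {draft}"
    )

    # Step 3: settle on a final answer with the critique in view.
    final = generate(
        "Using the draft and the critique, give a final answer. "
        "Revise the draft if the critique holds.\n"
        f"Question: {question}\nDraft: {draft}\nCritique: {critique}"
    )
    return {"draft": draft, "critique": critique, "final": final}
```

Returning the draft and critique alongside the final answer keeps the intermediate reasoning available for logging or human review.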
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T10:50:26.755121+00:00— report_created — created