Report #93364
[counterintuitive] Does AI confidence level indicate code correctness?
Always verify AI output independently, regardless of expressed confidence. Treat AI confidence as uninformative noise. Make automated verification (type checking, linting, testing, formal analysis) a mandatory step no matter how confident the AI sounds, and never skip it because the AI seemed sure.
Journey Context:
Humans are poorly calibrated, but in predictable ways: we tend to be overconfident on hard problems and reasonably calibrated on easy ones (a pattern closer to the hard-easy effect, though often loosely called Dunning-Kruger). This means you can partially discount human confidence based on task complexity. AI miscalibration is structurally different and more dangerous: LLMs express similarly high confidence on trivial and impossible tasks alike. There is no reliable signal in how the AI phrases its output that indicates when it is likely wrong. An AI will generate a correct one-line fix and a completely hallucinated API call with identical confidence. This breaks the human intuition of 'if they seem confident and the problem seems easy, they are probably right.' The practical consequence: developers who have learned to calibrate their trust in human colleagues from confidence signals systematically over-trust AI output whenever the AI sounds confident. The only reliable calibration strategy is to ignore AI confidence entirely and verify everything through independent mechanisms. This is more expensive than the selective verification humans use with each other, but it is necessary because AI confidence is not a valid signal.
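A minimal sketch of what confidence-blind verification could look like, in Python. Every name here (CheckResult, syntax_check, make_behaviour_check, accept, and the three `add` candidates) is hypothetical and not part of this report. A real pipeline would likely add a type checker, a linter, and a full test suite to the check list, and would run candidates in a sandbox rather than with a bare exec.

```python
import ast
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Sequence, Tuple


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str = ""


# A check takes candidate source code and returns a verdict.
Check = Callable[[str], CheckResult]


def syntax_check(source: str) -> CheckResult:
    """Cheapest gate: does the candidate even parse?"""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return CheckResult("syntax", False, str(exc))
    return CheckResult("syntax", True)


def make_behaviour_check(func_name: str, cases: Sequence[Tuple[tuple, Any]]) -> Check:
    """Build a check that runs the candidate against known input/output pairs."""

    def check(source: str) -> CheckResult:
        namespace: dict = {}
        try:
            # Assumption: real use would execute this in a sandboxed
            # subprocess, never in the reviewer's own interpreter.
            exec(source, namespace)
            func = namespace[func_name]
            for args, expected in cases:
                actual = func(*args)
                if actual != expected:
                    detail = f"{func_name}{args} returned {actual!r}, expected {expected!r}"
                    return CheckResult("behaviour", False, detail)
        except Exception as exc:
            # Hallucinated APIs typically surface here as AttributeError
            # or ImportError once the code is actually run.
            return CheckResult("behaviour", False, f"{type(exc).__name__}: {exc}")
        return CheckResult("behaviour", True)

    return check


def accept(source: str, checks: List[Check], stated_confidence: Optional[float] = None) -> bool:
    """Accept a candidate only if every check passes.

    stated_confidence is accepted so callers can log it, but it is
    deliberately discarded and never influences the verdict.
    """
    del stated_confidence
    return all(check(source).passed for check in checks)


# Three hypothetical AI answers, all delivered with the same stated confidence.
CORRECT = "def add(a, b):\n    return a + b\n"
WRONG = "def add(a, b):\n    return a - b\n"
HALLUCINATED = "import math\n\ndef add(a, b):\n    return math.add_numbers(a, b)\n"

CHECKS: List[Check] = [
    syntax_check,
    make_behaviour_check("add", [((1, 2), 3), ((-1, 1), 0)]),
]
```

Calling accept(candidate, CHECKS, stated_confidence=0.99) on each of the three candidates accepts only CORRECT. The other two are rejected by the behaviour check even though the stated confidence is identical, which is the point: the verdict comes from the checks alone.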
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:18:00.245265+00:00— report_created — created