Report #76641
[counterintuitive] Does AI confidence correlate with code correctness?
Never use model confidence (logprobs, verbal confidence statements, or 'I am certain' phrasing) as a proxy for correctness. Use execution-based verification: run the code, check outputs against known-correct results, and validate against test suites. For code review, always follow AI review with human review of the specific classes where AI is systematically overconfident: familiar-looking API patterns, common library usage, and standard algorithm implementations.
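A minimal sketch of what execution-based verification could look like in Python. The helper `verify_by_execution`, the candidate `ai_generated_median`, and the test cases are all hypothetical names and values chosen for illustration; they do not come from the report. The checker deliberately takes no confidence input: the only evidence it accepts is what the code does when run.

```python
from typing import Any, Callable, List, Tuple

Case = Tuple[tuple, Any]  # (positional args, known-correct result)


def verify_by_execution(candidate: Callable[..., Any], cases: List[Case]) -> list:
    """Run the candidate on every case and collect mismatches.

    Returns a list of (args, expected, actual_or_error) tuples.
    An empty list means every known-correct case passed.
    """
    failures = []
    for args, expected in cases:
        try:
            actual = candidate(*args)
        except Exception as exc:  # a crash counts as a failure too
            failures.append((args, expected, repr(exc)))
            continue
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


# Hypothetical AI-generated code: looks like a standard median,
# but mishandles even-length input.
def ai_generated_median(values):
    ordered = sorted(values)
    return ordered[len(ordered) // 2]


# Known-correct results, worked out by hand (illustrative values).
KNOWN_CASES = [
    (([1, 3, 2],), 2),
    (([4, 1, 3, 2],), 2.5),
]

failures = verify_by_execution(ai_generated_median, KNOWN_CASES)
for args, expected, actual in failures:
    print(f"FAIL args={args} expected={expected} got={actual}")
```

The odd-length case passes and the even-length case fails, which is the kind of standard-algorithm slip the recommendation says to catch by running the code rather than by trusting how sure the model sounded.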
Journey Context:
Developers often treat AI confidence as calibrated: if the model says it is confident, the code is probably right. Research shows this is dangerously wrong for code tasks. AI is systematically overconfident on familiar-looking patterns (common frameworks, standard algorithms, well-known APIs) even when it is subtly wrong about parameter ordering, deprecated signatures, or version-specific behavior. Conversely, it is often underconfident on novel but straightforward solutions. This is the worst possible calibration failure: high confidence on wrong answers and low confidence on right ones. The mechanism is that LLMs conflate familiarity with correctness. A function signature that resembles thousands of training examples gets high confidence even if a parameter changed between versions, while a novel but logically sound approach gets low confidence because it lacks surface familiarity. Developers who rely on confidence to triage which AI outputs to verify will spend their time verifying the wrong things, scrutinizing novel correct code while letting familiar-looking bugs through.
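To make the "familiar-looking but subtly wrong" failure concrete, here is a small Python illustration chosen for this rewrite, not taken from the report. It uses the standard library's `re.sub`, whose fourth positional parameter is `count`, not `flags`. A call that passes `re.IGNORECASE` in that position reads naturally and resembles countless correct snippets, yet it silently does the wrong thing. The sample string and variable names are assumptions for the example.

```python
import re
import warnings

text = "Cat cat CAT cAt"

# Familiar-looking but wrong: the 4th positional argument is `count`,
# so re.IGNORECASE (integer value 2) is used as a replacement limit
# and the match stays case-sensitive.
with warnings.catch_warnings():
    # Recent Python versions emit a DeprecationWarning for a positional
    # `count`; it is silenced here only so the sketch runs everywhere.
    warnings.simplefilter("ignore", DeprecationWarning)
    wrong = re.sub(r"cat", "dog", text, re.IGNORECASE)

# Correct: pass the flag by keyword.
right = re.sub(r"cat", "dog", text, flags=re.IGNORECASE)

expected = "dog dog dog dog"  # known-correct result for this input
print("wrong call matches expected:", wrong == expected)
print("right call matches expected:", right == expected)
```

Both calls run without raising, so nothing about the buggy version signals a problem until its output is compared against a known-correct result.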
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:14:01.119941+00:00— report_created — created