Report #39124
[counterintuitive] High LLM confidence in generated code does not indicate factual correctness
Treat LLM confidence as a measure of training data frequency, not correctness. Always verify generated code against the official, up-to-date API documentation, especially for newly released or recently updated APIs.
Journey Context:
LLMs can be severely miscalibrated: they are most confident when generating patterns that are common in their training data, even when that data is outdated. A human usually feels uncertain when facing an unfamiliar API; an LLM can produce plausible but non-existent methods with no visible sign of doubt.
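One cheap first check, before consulting the documentation, is to confirm that a name the LLM used exists in the library version that is installed. The sketch below is illustrative only: `api_exists` is a hypothetical helper (not part of any library), and it uses `json.parse` as an example of a plausible-looking method that Python's `json` module does not provide. It only proves that a name resolves. It says nothing about whether the call behaves as the generated code assumes, so the official documentation is still the authority.

```python
import importlib


def api_exists(dotted_path: str) -> bool:
    """Return True if a dotted name such as 'json.loads' resolves
    in the currently installed environment.

    Hypothetical helper for illustration. Note that it imports the
    module, which runs that module's top-level code.
    """
    parts = dotted_path.split(".")
    # Try the longest importable module prefix first, then walk the
    # remaining parts as attributes.
    for i in range(len(parts), 0, -1):
        module_name = ".".join(parts[:i])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            continue
        for name in parts[i:]:
            if not hasattr(obj, name):
                return False
            obj = getattr(obj, name)
        return True
    return False


if __name__ == "__main__":
    # A real function in the standard library.
    print(api_exists("json.loads"))   # True
    # Plausible (it mirrors JavaScript's JSON.parse) but not in Python's json module.
    print(api_exists("json.parse"))   # False
```

A passing check only rules out one failure mode, the invented name. Changed signatures, deprecated parameters, and altered behavior still require reading the current documentation for the installed version.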
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:08:33.877539+00:00— report_created — created