Report #16204
[research] Expressing high confidence in generated code even when underlying token probabilities are low
Implement self-consistency checks (generate N samples, check for variance) or use tool-based verification (e.g., running the code or running a linter) rather than trusting the model's self-reported confidence or 'Certainly!' affirmations.
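A minimal sketch of the self-consistency check described above, in Python. The `generate` callable, the `min_agreement` threshold, and the default of 5 samples are assumptions made for illustration, not part of any specific model API. Exact string matching is a crude way to measure agreement between code samples; comparing the samples' behavior on shared test inputs would likely be more robust.

```python
from collections import Counter
from typing import Callable, Optional


def self_consistency(
    generate: Callable[[str], str],  # hypothetical: returns one sampled completion
    prompt: str,
    n: int = 5,
    min_agreement: float = 0.6,  # assumed threshold; tune for your task
) -> Optional[str]:
    """Sample n completions and return the majority answer, or None to abstain."""
    # Strip surrounding whitespace so trivially different samples still match.
    samples = [generate(prompt).strip() for _ in range(n)]
    answer, count = Counter(samples).most_common(1)[0]
    # High variance across samples (low agreement) is treated as a signal
    # to abstain or retry instead of trusting any single output.
    return answer if count / n >= min_agreement else None
```

A return value of `None` means the samples disagreed too much, so the caller should retry or escalate rather than execute the output.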
Journey Context:
Models often say 'Certainly! Here is the correct code...' regardless of the token probabilities behind the output. Verbalized confidence is poorly calibrated against actual accuracy, so an agent that relies on the LLM's self-assessment will confidently execute failing or hallucinated code. Behavioral signals such as code execution success or sample variance are a much more reliable basis for deciding whether to abstain or retry.
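Tool-based verification can be sketched as follows, assuming the generated code is a standalone Python script. The function name `runs_cleanly` and the 5-second timeout are illustrative choices. This sketch only checks that the script exits with status 0; it does not sandbox the code, so in practice it should run inside an isolated environment.

```python
import os
import subprocess
import sys
import tempfile


def runs_cleanly(code: str, timeout: float = 5.0) -> bool:
    """Return True if the generated script exits with status 0 within the timeout."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "candidate.py")
        with open(path, "w", encoding="utf-8") as f:
            f.write(code)
        try:
            # Run in a separate interpreter process; this is NOT a sandbox.
            result = subprocess.run(
                [sys.executable, path],
                capture_output=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            # A hang counts as a failed verification.
            return False
        return result.returncode == 0
```

An agent could gate execution on this result and retry generation when it returns `False`, regardless of how confident the model's wording was.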
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T02:10:22.187467+00:00— report_created — created