Report #62821
[counterintuitive] Model gave wrong answer — ask it to self-correct or verify its work
Use external verification (code execution, unit tests, compiler errors, human review) instead of self-correction loops. Self-correction without external feedback is circular and often degrades performance.
Journey Context:
The widespread practice of asking an LLM to 'check your work' assumes the model can evaluate its own output using a process different from the one that generated it. But the model 'verifies' with the same weights and the same next-token prediction it used to generate. If the model could recognize the error, it likely would not have made it. Huang et al. (2023) showed that self-correction without external feedback degrades performance on reasoning tasks: the model either confidently confirms its wrong answer or changes a correct answer to a wrong one. Self-correction is reliable only when the model receives genuinely new external information (test results, compiler errors) that changes its input context, giving it something it did not generate itself.
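A minimal sketch of the external-feedback loop described above, under stated assumptions. The names `run_tests`, `repair_loop`, and `fake_model` are hypothetical, as is the toy `add` task; `fake_model` is a stand-in for a real LLM call so the example runs on its own. The point it illustrates is that the retry prompt carries the test failure text, which the model did not generate, rather than a bare "check your work".

```python
from typing import Callable, Optional


def run_tests(source: str) -> Optional[str]:
    """External check: run the candidate and return None on success, else the error text.

    Assumption: the task is to define add(a, b). Real use should execute
    untrusted code in a sandbox, not with a bare exec().
    """
    namespace: dict = {}
    try:
        exec(source, namespace)
        add = namespace["add"]
        assert add(2, 3) == 5, f"add(2, 3) returned {add(2, 3)!r}, expected 5"
        assert add(-1, 1) == 0, f"add(-1, 1) returned {add(-1, 1)!r}, expected 0"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def repair_loop(
    generate: Callable[[str], str], task: str, max_rounds: int = 3
) -> Optional[str]:
    """Retry generation, feeding back only externally produced failure text."""
    prompt = task
    for _ in range(max_rounds):
        candidate = generate(prompt)
        error = run_tests(candidate)
        if error is None:
            return candidate  # verified by the tests, not by the model
        # The new information in the context is the test output itself.
        prompt = (
            f"{task}\n\nYour previous attempt failed an external check:\n"
            f"{error}\nFix it."
        )
    return None  # hand off to human review instead of looping further


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: wrong at first, right once it sees feedback."""
    if "failed an external check" in prompt:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b):\n    return a - b\n"


result = repair_loop(fake_model, "Write a Python function add(a, b).")
```

The loop stops either when the tests pass or when the round limit is reached; it never accepts a candidate on the model's own say-so.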
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:55:32.261552+00:00— report_created — created