Report #93086
[counterintuitive] Telling the model to check its work or self-correct improves reasoning accuracy
Use external verification tools (code execution, unit tests, formal checkers, human review) to validate outputs. Do not rely on the model to verify its own reasoning without external ground truth. Self-correction helps with formatting and style, but not with reasoning correctness.
Journey Context:
The intuition is strong: humans improve by checking their work, so models should too. But an LLM uses the same flawed reasoning process to evaluate an answer as it used to generate it. When a model makes an error because of a reasoning gap, asking it to "verify" typically produces a post-hoc rationalization of the original wrong answer, not a genuine correction. The model has no independent verification mechanism; it can only re-run the same process, which is biased toward consistency with its prior output. Huang et al. (2023) reported that self-correction without external feedback can degrade performance on reasoning tasks, and that the model can become more confident in wrong answers. The one exception: self-correction can help with surface-level constraint checking (format, style, stated requirements), where the model can detect violations without re-reasoning about the core problem. For reasoning tasks, only external feedback breaks the circularity.
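As a rough illustration of what "external feedback" can mean in practice, the sketch below runs each candidate against ground-truth test cases and passes only the concrete failures back to the generator. It is a minimal sketch, not a reference implementation: `external_check`, `refine_with_external_feedback`, `stub_propose`, and `CASES` are hypothetical names invented for this example, and `stub_propose` stands in for a real model call.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical type for illustration: (positional args, expected result).
TestCase = Tuple[tuple, object]


def external_check(candidate: Callable, cases: List[TestCase]) -> List[str]:
    """Run the candidate against ground-truth cases; return failure messages."""
    failures = []
    for args, expected in cases:
        try:
            actual = candidate(*args)
        except Exception as exc:  # a crash is also external evidence
            failures.append(f"{args!r}: raised {type(exc).__name__}: {exc}")
            continue
        if actual != expected:
            failures.append(f"{args!r}: expected {expected!r}, got {actual!r}")
    return failures


def refine_with_external_feedback(
    propose: Callable[[List[str]], Callable],
    cases: List[TestCase],
    max_rounds: int = 3,
) -> Optional[Callable]:
    """Ask `propose` for candidates until one passes every external check.

    `propose` receives the failure messages from the previous round (an empty
    list on the first round). The verdict always comes from the test cases,
    never from the generator's own opinion of its answer.
    """
    feedback: List[str] = []
    for _ in range(max_rounds):
        candidate = propose(feedback)
        feedback = external_check(candidate, cases)
        if not feedback:
            return candidate
    return None  # nothing passed; escalate (e.g. to human review)


def stub_propose(feedback: List[str]) -> Callable:
    """Stand-in for a model call (assumption): wrong first, right after feedback."""
    if not feedback:
        return lambda a, b: a - b  # deliberately wrong first attempt
    return lambda a, b: a + b


# Ground truth for an "add two numbers" task, supplied from outside the model.
CASES: List[TestCase] = [((1, 2), 3), ((0, 0), 0), ((-4, 4), 0)]
```

Calling `refine_with_external_feedback(stub_propose, CASES)` rejects the first candidate because the test cases fail, not because the generator doubted itself. If no candidate passes within `max_rounds`, the function returns `None` rather than accepting an unverified answer.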
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T14:49:58.015981+00:00— report_created — created