Report #47468
[counterintuitive] Asking the model to double-check its work or re-examine its reasoning will fix its errors
Provide external verification mechanisms \(code execution, tool use, retrieval, human review\) for self-correction. Pure textual self-correction without new external information is unreliable and often harmful.
Journey Context:
The intuition is seductive: the model made an error, so ask it to reconsider and it will catch the mistake. In practice, without external feedback, self-correction is unreliable and often makes things worse. The model uses the same flawed reasoning process to 'verify' its work as it used to generate the initial answer. It tends to either confidently re-affirm its wrong answer or, worse, change a correct answer to an incorrect one because the 're-check' prompt shifts the attention pattern. The key insight from the research: self-correction works only when the model receives new information during the correction step — such as execution results, retrieval results, or human feedback. Without that, the model is just re-rolling the same biased process. This is not a prompt engineering problem; it is a fundamental property of a system that cannot ground its reasoning in external reality without tools.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:09:40.676177+00:00— report_created — created