Report #84993
[counterintuitive] Asking the model to review and self-correct its own answer improves reasoning quality
Provide external verification mechanisms (code execution, tool use, test results, human feedback) rather than relying on the model to catch its own errors in the same generation context.
Journey Context:
A common pattern is to append 'Review your answer and fix any errors' to a prompt, assuming the model can evaluate its own output the way a human reviewer would. Research indicates this is largely ineffective for reasoning tasks. Without external feedback, the model's self-evaluation relies on the same reasoning process that produced the initial (potentially flawed) answer, so it tends either to confidently reaffirm a wrong answer or to 'correct' a right answer into a wrong one. The intuition: if the model could identify the error, it likely would not have made it in the first place. Self-correction works reliably only when the model receives new information from an external source (e.g., a compiler error or a test failure) that it could not have generated itself. This is a basic limitation of self-referential evaluation within the same generation context, where no new evidence enters the loop.
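The external-feedback loop described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `run_tests`, `solve_with_external_feedback`, and `fake_model` are hypothetical names introduced here, and `fake_model` is a stub standing in for a real model call. The point it shows is that the retry prompt carries a concrete test failure rather than a generic "review your answer" instruction.

```python
from typing import Callable, Optional

# One test case: (positional arguments, expected return value).
TestCase = tuple[tuple, object]


def run_tests(source: str, tests: list[TestCase], func_name: str) -> Optional[str]:
    """Run candidate code against tests; return a failure message, or None on success.

    Assumption: exec() is used only for illustration. Untrusted model output
    should be executed in a sandboxed process, not in the host interpreter.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)
        func = namespace[func_name]
        for args, expected in tests:
            actual = func(*args)
            if actual != expected:
                return f"{func_name}{args} returned {actual!r}, expected {expected!r}"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def solve_with_external_feedback(
    generate: Callable[[str, Optional[str]], str],
    task: str,
    tests: list[TestCase],
    func_name: str,
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask for a candidate, verify it externally, and retry with the failure message."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        candidate = generate(task, feedback)
        feedback = run_tests(candidate, tests, func_name)
        if feedback is None:
            return candidate  # verified by the tests, not by the model itself
    return None  # no candidate passed within the round budget


def fake_model(task: str, feedback: Optional[str]) -> str:
    """Hypothetical stand-in for a model: wrong first, fixed once it sees a failure."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


if __name__ == "__main__":
    result = solve_with_external_feedback(
        fake_model, "Write add(a, b).", [((1, 2), 3)], "add"
    )
    print(result)
```

In a real setup the `generate` callable would wrap an actual model client and include the `feedback` string in the follow-up prompt; the verifier could equally be a compiler, a linter, or a human reviewer.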
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:14:52.377782+00:00— report_created — created