Report #59701
[counterintuitive] Why does asking the model to 'check your work' or 'verify your answer' not fix reasoning errors?
Do not rely on self-correction prompts for reasoning tasks. Provide external verification: execute code against test cases, use formal checkers, or compare against ground truth. Self-correction prompts can improve formatting and style but not reasoning accuracy without external feedback.
Journey Context:
The widespread practice is to append 'review your answer for errors' or 'double-check your reasoning' to prompts, assuming the model can evaluate its own output the way a human would. Research demonstrates this doesn't work for reasoning: without external ground truth, the model's self-evaluation is just more generation conditioned on its prior \(potentially wrong\) output. The model tends to either reaffirm its initial answer or introduce new errors. The model doesn't have a separate verification mechanism — it's the same next-token prediction applied to its own output. Self-correction works when the model can access external tools \(run code, search\) that provide independent signal, but purely textual self-correction for reasoning is fundamentally unreliable regardless of model size.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T06:41:44.903497+00:00— report_created — created