Report #99998
[counterintuitive] Asking an LLM to 'check your work' reliably fixes reasoning errors
Use external feedback (unit tests, code execution, search, verifiers) instead of ungrounded self-critique. If no oracle is available, self-consistency or repeated sampling is generally safer than self-correction.
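A minimal sketch of the repeated-sampling fallback, assuming no oracle is available. The `sample` callable is hypothetical: it stands in for one independent model call (sampled with some randomness) that returns only the final answer. Nothing here is tied to a particular provider's API.

```python
from collections import Counter
from typing import Callable, Tuple


def self_consistent_answer(sample: Callable[[], str], n: int = 5) -> Tuple[str, float]:
    """Sample n answers independently and return the most common one.

    `sample` is a hypothetical zero-argument callable wrapping one model
    call; it is an assumption of this sketch, not a real library function.
    Returns the winning answer and the fraction of samples that agreed.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    answers = [sample().strip() for _ in range(n)]
    # Plurality vote: ties go to whichever answer was seen first.
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / n
```

The agreement fraction is returned so that a caller can treat low agreement as a reason to escalate to a human or an external check, rather than asking the model to reconsider its own answer.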
Journey Context:
Self-correction is widely believed to improve reasoning. Huang et al. reported the opposite for intrinsic self-correction: when models critique their own answers without external feedback, they are more likely to change correct answers into incorrect ones than the reverse. The fundamental issue is that LLMs cannot reliably judge the correctness of their own reasoning. Prompting for reflection can help when ground truth is fed back into the prompt, but self-correction without that signal often degrades performance.
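The grounded alternative can be sketched as a loop in which only an external check decides whether an answer is accepted. This is an illustration under stated assumptions, not a definitive implementation: `revise_with_oracle`, `verify_add`, and `fake_model` are hypothetical names, and `fake_model` is a canned stand-in for a real model call.

```python
from typing import Callable, Optional, Tuple


def revise_with_oracle(
    generate: Callable[[Optional[str]], str],
    verify: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Optional[str]:
    """Return the first candidate the external verifier accepts, else None.

    The model never judges its own output; it only sees verifier feedback.
    """
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        candidate = generate(feedback)
        ok, feedback = verify(candidate)
        if ok:
            return candidate
    return None


def verify_add(source: str) -> Tuple[bool, str]:
    """Hypothetical oracle: run the candidate code and test add(2, 3)."""
    namespace: dict = {}
    try:
        # Only execute untrusted model output inside a sandbox in practice.
        exec(source, namespace)
        result = namespace["add"](2, 3)
    except Exception as exc:
        return False, f"raised {exc!r}"
    if result != 5:
        return False, f"add(2, 3) returned {result}, expected 5"
    return True, ""


def fake_model(feedback: Optional[str]) -> str:
    """Canned stand-in for an LLM: buggy first draft, fixed after feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b"
    return "def add(a, b):\n    return a + b"


accepted = revise_with_oracle(fake_model, verify_add)
```

The design choice is that a revision is kept only when the verifier passes, so a correct answer cannot be talked into an incorrect one by an unverified critique.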
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:25:13.198880+00:00— report_created — created