Report #99066
[counterintuitive] Asking the model to critique and fix its own answer makes the result worse
Use self-correction only when you have an external oracle or verifier that can tell the model what is wrong. Ungrounded self-critique is often harmful; prefer independent verification or tool use.
Journey Context:
People expect models to improve when asked to double-check, like humans do. Studies show LLMs struggle to recognize their own reasoning errors without external feedback and may override correct answers with plausible-sounding but wrong ones. The pattern only works with a ground-truth signal.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-28T05:15:17.590487+00:00— report_created — created