Report #47749
[counterintuitive] Asking the model to review its own answer and fix mistakes reliably catches errors
Don't rely on self-correction without external feedback. If the model made an error due to a knowledge gap or reasoning failure, asking it to 'check your work' typically regenerates the same error or introduces new ones. Provide external validation signals (test execution results, reference data, tool outputs) that give the model genuinely new information to correct against.
Journey Context:
Self-correction seems intuitive because humans do it. But research on LLM self-correction without external feedback has found it largely ineffective. The model generates its initial answer from its internal state; asking it to 'review' just runs another generation pass over essentially the same information. If the error came from a knowledge gap or flawed reasoning pattern, the model will likely reproduce it, because it can't verify what it doesn't know. In some cases, self-correction without external input actually degrades performance, as the model second-guesses answers that were correct. True self-correction requires new information (execution results, retrieval, or human feedback) that changes the input to the correction step. This is a fundamental limitation: the model cannot step outside its own learned distribution to validate its outputs.
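A minimal sketch of a correction step whose input is changed by an external signal, as described above. Everything named here is hypothetical and for illustration only: `correct_with_feedback`, `check_product`, and `stub_model` are not part of any library, `generate` stands in for whatever model call you use, and the validator stands in for a test run, a reference lookup, or a tool output.

```python
from typing import Callable, Tuple

# Assumed interfaces (not from any real library):
#   generate(prompt) -> answer text
#   validate(answer) -> (passed, feedback from an external check)
Generator = Callable[[str], str]
Validator = Callable[[str], Tuple[bool, str]]


def correct_with_feedback(
    generate: Generator,
    validate: Validator,
    task: str,
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Retry only when an external check fails, and feed its output back in."""
    prompt = task
    answer = ""
    for _ in range(max_rounds):
        answer = generate(prompt)
        passed, feedback = validate(answer)
        if passed:
            return answer, True
        # The retry prompt carries information the model did not have before.
        prompt = (
            f"{task}\n\n"
            f"Previous answer: {answer}\n"
            f"External check: {feedback}\n"
            "Revise the answer using the check result."
        )
    return answer, False


def check_product(answer: str) -> Tuple[bool, str]:
    """Hypothetical validator: compare against an independently computed value."""
    expected = 17 * 23
    try:
        got = int(answer.strip())
    except ValueError:
        return False, f"expected an integer, got {answer!r}"
    if got == expected:
        return True, ""
    return False, f"{got} does not match the reference computation (off by {expected - got:+d})"


def stub_model(prompt: str) -> str:
    """Stand-in for a real model: wrong until the prompt contains external feedback."""
    return "391" if "External check:" in prompt else "381"


answer, ok = correct_with_feedback(
    stub_model, check_product, "What is 17 * 23? Reply with the number only."
)
```

The loop never asks the model to simply "check your work"; it retries only when the validator reports a failure, and it stops after `max_rounds` so a model that cannot use the feedback does not loop forever.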
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:37:48.473034+00:00— report_created — created