Report #26252
[counterintuitive] Adding 'review your answer and fix any mistakes' reliably improves output quality in a single turn
Implement structured verification loops with external feedback, not single-turn self-correction. For code: generate, run tests, feed errors back, fix. For reasoning: generate answer, verify against known constraints or examples, iterate. Never ask a model to 'check its work' without providing new information (test results, reference output, lint errors) that wasn't available in the initial generation.
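The generate, run tests, feed errors back, fix cycle for code can be sketched as below. This is a minimal illustration, not a reference implementation: `generate` stands in for whatever model call you use, and `run_tests`, `correction_loop`, and `stub_model` are hypothetical names introduced only for this example. The tests are assumed to be plain `assert` statements appended to the candidate code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_tests(code: str, tests: str, timeout: float = 10.0) -> Optional[str]:
    """Run candidate code plus assert-based tests in a subprocess.

    Returns None when the tests pass, otherwise the captured error text.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(code + "\n\n" + tests + "\n")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return f"timed out after {timeout} seconds"
    if proc.returncode == 0:
        return None
    return proc.stderr.strip() or proc.stdout.strip() or f"exit code {proc.returncode}"


def correction_loop(
    generate: Callable[[str, Optional[str]], str],
    task: str,
    tests: str,
    max_rounds: int = 3,
) -> str:
    """Regenerate until the tests pass, passing real test output back each round."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        code = generate(task, feedback)    # feedback is None only on the first round
        feedback = run_tests(code, tests)  # new information from outside the model
        if feedback is None:
            return code
    raise RuntimeError(f"tests still failing after {max_rounds} rounds:\n{feedback}")


def stub_model(task: str, feedback: Optional[str]) -> str:
    """Hypothetical stand-in for a model: buggy first, fixed once it sees an error."""
    if feedback is None:
        return "def add(a, b):\n    return a - b"
    return "def add(a, b):\n    return a + b"
```

Calling `correction_loop(stub_model, "write add(a, b)", "assert add(2, 3) == 5")` exercises one failing round and one passing round. The point of the design is that every retry carries the captured error text, so the model is never asked to re-check its work blind.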
Journey Context:
Self-correction sounds intuitive: a second pass should catch errors, right? Huang et al. (2023, 'Large Language Models Cannot Self-Correct Reasoning Yet') demonstrated that LLMs often cannot reliably identify their own reasoning mistakes without external feedback. Asked to review its answer with no new information, the model tends to reproduce similar reasoning, reach similar conclusions, and then confirm its own answer. Generic 'review and improve' instructions often cause the model to either defend its initial answer or make performative changes that don't fix real errors. What works is external grounding (test results, compiler errors, reference solutions) that gives the model new information to reason from. For coding agents, the edit-run-edit loop is not optional; it's the core mechanism that makes self-correction actually work.
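For reasoning tasks, the same grounding can come from checking an answer against known constraints and reporting exactly which ones fail. The sketch below is illustrative only: `violations` and `retry_prompt` are hypothetical helpers, and the answer is assumed to be a simple mapping from variable names to integers. It refuses to build a retry prompt when there is no new evidence to include.

```python
from typing import Callable, Dict, List, Tuple

Answer = Dict[str, int]
# A constraint is a readable name plus a predicate over the answer.
Constraint = Tuple[str, Callable[[Answer], bool]]


def violations(answer: Answer, constraints: List[Constraint]) -> List[str]:
    """Return the names of all constraints the answer fails, in order."""
    return [name for name, holds in constraints if not holds(answer)]


def retry_prompt(task: str, answer: Answer, failed: List[str]) -> str:
    """Build a follow-up prompt that carries concrete, external evidence."""
    if not failed:
        # Nothing new to tell the model, so a 'check your work' retry adds nothing.
        raise ValueError("no new information: do not ask for a blind re-check")
    lines = [task, f"Previous answer: {answer}", "It violates these constraints:"]
    lines += [f"- {name}" for name in failed]
    return "\n".join(lines)


rules: List[Constraint] = [
    ("x + y == 10", lambda a: a["x"] + a["y"] == 10),
    ("x > y", lambda a: a["x"] > a["y"]),
]
```

With these rules, an answer of `{"x": 3, "y": 7}` fails only the `x > y` check, and that specific failure is what goes into the next prompt instead of a generic 'review and improve' instruction.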
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T22:28:02.173059+00:00— report_created — created