Report #74917
[counterintuitive] Why does asking an LLM to review and fix its own answer often make the answer worse or cause the model to double down on errors?
Provide external, objective feedback (e.g., compiler errors, unit test results, search engine output) during self-correction loops. Do not ask the model to self-correct in a vacuum.
Journey Context:
The prevailing mental model is that self-reflection lets the model 'think better' and catch its own mistakes. However, without new external information, the model is only sampling again from the same learned distribution, now conditioned on its own earlier output. If the initial answer was wrong, asking the model to check its work often leads it to rationalize the error or to drift to a different, equally wrong answer. True self-correction requires grounding in an external reality; otherwise, it is just rephrasing the same hallucination.
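A grounded correction loop can be sketched as follows. This is a minimal illustration, not a reference implementation: `correct_with_feedback`, `run_unit_test`, and `stub_model` are hypothetical names, the stub stands in for a real LLM call, and the task (an `add` function) is invented for the example. The point it shows is that each retry receives a concrete error message from an external check rather than a bare "please review your answer".

```python
from typing import Callable, Optional, Tuple

# Hypothetical model interface: (task, previous_attempt, feedback) -> new attempt.
# In practice this would wrap whatever LLM client you use.
Generate = Callable[[str, Optional[str], Optional[str]], str]
Check = Callable[[str], Optional[str]]


def run_unit_test(source: str) -> Optional[str]:
    """External check: execute the candidate and test its `add` function.

    Returns None on success, otherwise an error message to feed back.
    NOTE: exec() on model output is unsafe; a real setup should run the
    candidate in a sandbox or a separate process.
    """
    namespace: dict = {}
    try:
        exec(compile(source, "<candidate>", "exec"), namespace)
        result = namespace["add"](2, 3)
    except Exception as exc:  # syntax errors, missing names, runtime errors
        return f"{type(exc).__name__}: {exc}"
    if result != 5:
        return f"add(2, 3) returned {result!r}, expected 5"
    return None


def correct_with_feedback(
    task: str, generate: Generate, check: Check, max_rounds: int = 3
) -> Tuple[str, int]:
    """Retry until the external check passes; return (attempt, rounds used)."""
    attempt: Optional[str] = None
    feedback: Optional[str] = None
    for round_number in range(1, max_rounds + 1):
        attempt = generate(task, attempt, feedback)
        feedback = check(attempt)  # objective signal, not the model's opinion
        if feedback is None:
            return attempt, round_number
    raise RuntimeError(f"no passing attempt after {max_rounds} rounds: {feedback}")


def stub_model(task: str, previous: Optional[str], feedback: Optional[str]) -> str:
    # Stand-in for an LLM: it only changes its answer when given concrete feedback.
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


fixed, rounds = correct_with_feedback("Write add(a, b).", stub_model, run_unit_test)
print(rounds)
```

The loop stops on the verifier's verdict, never on the model's own claim that the answer is now correct, and it gives up after `max_rounds` instead of retrying indefinitely.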
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:20:48.356429+00:00— report_created — created