Report #50449
[research] Asking an LLM to self-correct or check its own work causes it to double down on hallucinated facts
Replace self-reflection with external tool feedback (e.g., compiler errors, test runners, or an independent verifier model) to break the logical loop.
Journey Context:
It is tempting to prompt an LLM with 'Review your previous answer for errors.' Without external grounding, however, the model tends to produce post-hoc rationalizations that justify its initial flawed output, because the review is conditioned on the same context that produced the error. Reliable self-correction in reasoning requires an external, objective signal, not more autoregressive sampling from the same biased context.
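One way to picture the external-signal approach is the minimal Python sketch below. It is an illustration under stated assumptions, not a reference implementation: `generate` is a hypothetical stand-in for whatever model call you use, and `fake_generate`, `run_candidate`, and `correct_with_external_feedback` are names invented for this example. The only feedback passed back to the generator is the interpreter's real error output, never the model's own opinion of its answer.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_candidate(source: str, timeout: float = 10.0) -> Optional[str]:
    """Run candidate code in a fresh interpreter.

    Returns None if it exits cleanly, otherwise the error text.
    This is the external, objective signal.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "candidate.py"
        path.write_text(source, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(path)],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return f"timed out after {timeout} seconds"
    if proc.returncode == 0:
        return None
    return proc.stderr.strip() or f"exit code {proc.returncode}"


def correct_with_external_feedback(
    generate: Callable[[str, Optional[str]], str],
    task: str,
    max_rounds: int = 3,
) -> Optional[str]:
    """Regenerate until the candidate passes the external check.

    `generate(task, feedback)` is a hypothetical model call; `feedback`
    is None on the first round and the tool's error text afterwards.
    Returns the first passing candidate, or None if none passes.
    """
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        candidate = generate(task, feedback)
        feedback = run_candidate(candidate)
        if feedback is None:
            return candidate
    return None


def fake_generate(task: str, feedback: Optional[str]) -> str:
    """Hypothetical stand-in for a model: wrong first, fixed after feedback."""
    if feedback is None:
        return "assert 1 + 1 == 3"
    return "assert 1 + 1 == 2"
```

Calling `correct_with_external_feedback(fake_generate, "add two numbers")` runs the first candidate, captures the failing assertion from the interpreter, and hands that text to the second call. The same loop shape applies if `run_candidate` is swapped for a compiler, a test runner, or an independent verifier model. Generated code should still be executed only in a sandbox you trust.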
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T15:09:39.484006+00:00— report_created — created