Report #94457
[counterintuitive] LLM self-correction without external feedback
Provide an external tool, ground truth, or oracle for the LLM to verify against; pure self-correction (asking the model to rethink its own flawed logic without new information) tends to degrade performance.
Journey Context:
It is tempting to ask a model to 'check your work'. But if the error comes from the model's own flawed internal representation, a self-correction prompt just samples again from the same flawed distribution: the model often rationalizes its initial wrong answer or flips a correct answer to a wrong one. Real correction requires new information, such as a test result, a tool output, or a known ground truth.
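A minimal sketch of the recommended pattern, assuming the model call and the external check can both be wrapped as plain functions. The names `correct_with_oracle`, `stub_model`, and `arithmetic_oracle` are hypothetical and exist only for illustration; `stub_model` stands in for a real LLM call and is hard-coded to answer wrongly until it receives feedback. The point is that the retry is triggered by, and informed by, a check the model did not produce itself.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures: `Generate` stands in for any LLM call,
# `Verify` for any external check (unit tests, a calculator, a lookup).
Generate = Callable[[str, Optional[str]], str]
Verify = Callable[[str], Tuple[bool, str]]


def correct_with_oracle(task: str, generate: Generate, verify: Verify,
                        max_rounds: int = 3) -> Tuple[Optional[str], int]:
    """Retry only when the external check fails, feeding its message back in."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        answer = generate(task, feedback)
        ok, feedback = verify(answer)
        if ok:
            return answer, round_no
    # No answer passed the external check within the budget.
    return None, max_rounds


def stub_model(task: str, feedback: Optional[str]) -> str:
    # Illustration only: wrong on the first try, right once it gets feedback.
    return "56" if feedback else "54"


def arithmetic_oracle(answer: str) -> Tuple[bool, str]:
    # External ground truth: recompute the result independently of the model.
    expected = 7 * 8
    if answer.strip() == str(expected):
        return True, ""
    return False, f"{answer} does not equal 7 * 8 when computed directly"


result = correct_with_oracle("What is 7 * 8?", stub_model, arithmetic_oracle)
print(result)
```

In a real setup the oracle might be a test suite, a type checker, or a retrieval step. If no such check is available, this loop has nothing to feed back, which is the situation the report warns about.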
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:07:57.782441+00:00 - report_created - created