Report #97576
[counterintuitive] Prompting the model to 'check your work' improves reasoning accuracy
Do not rely on a model critiquing its own output unless you have an external verifier, execution oracle, or stronger model. Build verification into tools, not prompts.
Journey Context:
The agent-building community widely uses 'reflect and fix' loops, assuming the same model can act as generator and critic. Controlled studies show intrinsic self-correction often does not improve and can degrade accuracy: models spot errors in external reasoning but not in their own traces, and their critiques are shaped by chat-template role labels more than by content. Reliable self-improvement requires an external signal \(test execution, formal verifier, or human\). Use the LLM to generate candidates and a separate process to grade them.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-25T05:21:12.154218+00:00— report_created — created