Report #35363
[counterintuitive] LLM makes a logical error early in its reasoning and refuses to self-correct when told to check its work
Validate intermediate steps with an external verifier, a code execution environment, or a tree-search algorithm. Do not rely on the LLM to correct its own flawed logic while it is still conditioned on the context that contains the error.
Journey Context:
Developers often assume that prompting an LLM to 'check your work' or 'find your mistake' enables true self-correction. However, autoregressive LLMs behave like 'System 1' thinkers: they generate the most likely next token given the preceding context. If the model makes a logical error at step 2, that error becomes part of the context for step 3, and every later step is conditioned on it. The model then tends to generate a plausible-sounding justification for the flawed step 2 rather than backtracking. Reliable self-correction requires an external 'System 2' loop that evaluates the output against ground truth.
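The external loop described above might look roughly like the sketch below. It is an illustration under stated assumptions, not a reference implementation: `propose_step` is a hypothetical stand-in for a real LLM call, `fake_model` is a scripted stub used only to show the behavior, and the verifier only handles steps of the assumed form `<arithmetic expression> = <value>`. The key point is that a rejected step is never appended to the context the model sees next.

```python
import ast
import operator
from typing import Callable, List, Optional

# Whitelisted operators for recomputing a step outside the model.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def _eval(node: ast.AST) -> float:
    """Evaluate a small arithmetic AST without calling eval()."""
    if (isinstance(node, ast.Constant) and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")


def step_is_valid(step: str) -> bool:
    """External check: recompute the left side of '<expr> = <value>'."""
    try:
        lhs, rhs = step.split("=")
        tree = ast.parse(lhs.strip(), mode="eval")
        return abs(_eval(tree.body) - float(rhs)) < 1e-9
    except (ValueError, SyntaxError, ZeroDivisionError):
        return False


# Hypothetical model interface: (accepted steps, retry index) -> next step,
# or None when the model considers the problem finished.
ProposeFn = Callable[[List[str], int], Optional[str]]


def solve_with_verifier(propose_step: ProposeFn, max_steps: int = 10,
                        max_retries: int = 3) -> Optional[List[str]]:
    """Accept a step only after the external verifier confirms it."""
    accepted: List[str] = []
    for _ in range(max_steps):
        for attempt in range(max_retries):
            step = propose_step(accepted, attempt)
            if step is None:
                return accepted
            if step_is_valid(step):
                accepted.append(step)  # only verified steps enter the context
                break
        else:
            return None  # no valid step found; fail instead of building on an error
    return accepted


def fake_model(accepted: List[str], attempt: int) -> Optional[str]:
    """Scripted stub for an LLM: its first try at step 2 is wrong."""
    script = [["2 + 3 = 5"], ["5 * 4 = 25", "5 * 4 = 20"], ["20 - 1 = 19"]]
    if len(accepted) >= len(script):
        return None
    options = script[len(accepted)]
    return options[min(attempt, len(options) - 1)]
```

With the stub, `solve_with_verifier(fake_model)` discards the faulty `5 * 4 = 25` and continues from the resampled step, so step 3 is never conditioned on the error. In a real system the verifier would be whatever ground-truth check fits the task, such as running tests or executing generated code.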
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T13:49:53.777451+00:00— report_created — created