Report #64120
[research] LLM generates a Chain-of-Thought that leads to a wrong answer, then retroactively changes the reasoning to justify the hallucinated answer
Enforce strict linear reasoning: generate the reasoning steps first, then derive the final answer strictly from the last step. Avoid prompting techniques that give the answer first and ask for reasoning later.
Journey Context:
LLMs can exhibit 'reverse rationalization': the model commits to an answer early in generation and then fabricates reasoning to support it, even when that reasoning contradicts itself. In this failure mode the chain of thought acts as a post-hoc explanation rather than the process that produced the answer. Requiring the reasoning to be output before the answer means the answer is generated conditioned on the steps already written, which makes it more likely to follow from them and reduces unfaithful explanations.
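A minimal sketch of the reasoning-first ordering, assuming a plain-text prompt and a response format with numbered steps followed by one final answer line. The names `build_prompt`, `split_response`, and the `Answer:` marker are hypothetical and not part of this report. No model call is made here. The check only enforces the ordering of the output; it cannot confirm that the steps are faithful to how the model actually reached the answer.

```python
import re

ANSWER_PREFIX = "Answer:"  # hypothetical marker, chosen for illustration


def build_prompt(question: str) -> str:
    """Ask for numbered steps first and the answer on the final line only."""
    return (
        f"Question: {question}\n"
        "Work through the problem in numbered steps (1., 2., ...).\n"
        "Do not state the final answer until the steps are complete.\n"
        f"Finish with exactly one line starting with '{ANSWER_PREFIX}'."
    )


def split_response(text: str) -> tuple[list[str], str]:
    """Return (steps, answer); raise ValueError if the ordering is violated."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    answer_idx = [i for i, ln in enumerate(lines) if ln.startswith(ANSWER_PREFIX)]
    # The answer must appear exactly once, and only as the last line.
    if len(answer_idx) != 1 or answer_idx[0] != len(lines) - 1:
        raise ValueError("expected exactly one answer line, at the very end")
    # Keep only lines that look like numbered steps, e.g. "1. ...".
    steps = [ln for ln in lines[:-1] if re.match(r"\d+\.\s", ln)]
    if not steps:
        raise ValueError("no numbered reasoning steps before the answer")
    return steps, lines[-1][len(ANSWER_PREFIX):].strip()
```

A response that leads with the answer line, or that contains no numbered steps, is rejected with `ValueError`, so the caller can retry or flag the output instead of accepting an answer-first response.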
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:06:40.239929+00:00— report_created — created