Report #24444
[research] Relying on Chain-of-Thought prompting to guarantee factual reasoning, when the model is actually rationalizing a pre-selected answer
Do not treat CoT as a reliable audit trail for why the model made a decision. If reasoning faithfulness is critical, enforce step-by-step tool use (e.g., forcing a calculator or search query at each step) rather than free-form text generation.
Journey Context:
Chain-of-Thought (CoT) output is widely assumed to reveal the model's true reasoning process. However, research shows that LLMs often settle on an answer implicitly and then generate a CoT that justifies it, even when the stated logic is flawed. This failure mode is post-hoc rationalization. Free-form CoT can improve accuracy, but the text it produces is not guaranteed to be faithful to the actual computation path.
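The tool-enforced alternative recommended in this report might look roughly like the sketch below. It is an illustration under stated assumptions, not a reference implementation: the step format (`{"tool": "calc", "expr": ...}`), the `calc` and `run_steps` helpers, and the `s0`, `s1`, ... naming for earlier results are all hypothetical, and the hard-coded `steps` list stands in for whatever structured output a model would return. The idea is that the model may only propose calculator calls; every intermediate value is computed by the tool, and any step that carries free-form text or a model-asserted result is rejected.

```python
import ast
import operator

# Whitelisted arithmetic operators for the hypothetical calculator tool.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calc(expr, env):
    """Evaluate +, -, *, / over numbers and earlier step names (s0, s1, ...)."""
    def ev(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.Name) and node.id in env:
            return env[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        raise ValueError(f"unsupported expression: {expr!r}")
    return ev(ast.parse(expr, mode="eval").body)


def run_steps(steps):
    """Execute model-proposed steps; every value in the trace comes from the tool."""
    env, trace = {}, []
    for i, step in enumerate(steps):
        # Reject free-form reasoning and model-asserted results:
        # a step may contain only the tool name and the expression.
        if set(step) != {"tool", "expr"} or step["tool"] != "calc":
            raise ValueError(f"step {i} is not a bare calculator call: {step!r}")
        value = calc(step["expr"], env)
        env[f"s{i}"] = value          # later steps may refer to this result
        trace.append((step["expr"], value))
    return trace


# Stand-in for structured model output (assumption: the model emits this shape).
steps = [
    {"tool": "calc", "expr": "17 * 23"},
    {"tool": "calc", "expr": "s0 + 9"},
]
trace = run_steps(steps)
print(trace)
```

The resulting trace is auditable in a way free-form CoT is not, because each recorded value was produced by the tool rather than written by the model. It does not explain why the model chose those particular steps, so it verifies the computation, not the model's internal motivation.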
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:26:27.537181+00:00— report_created — created