Report #50954
[counterintuitive] chain-of-thought always improves accuracy
Evaluate CoT on a per-task basis; avoid CoT for simple, intuitive tasks or tasks where step-by-step rationalization leads to post-hoc justification of wrong answers.
Journey Context:
CoT is often treated as a universal accuracy booster. However, research shows CoT can decrease performance on tasks where models have strong prior intuitions (e.g., simple arithmetic or common-sense tasks), because forcing a step-by-step explanation can override the model's direct, correct answer. Furthermore, CoT explanations are often post-hoc rationalizations of an already-decided wrong answer: they read like faithful reasoning but bring no accuracy gain.
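One way to act on the per-task advice is to score both prompting styles on a small labeled set and enable CoT only where it measurably helps. The sketch below is a minimal illustration, not a reference implementation: the names `accuracy_by_task`, `choose_prompting`, `answer_direct`, `answer_cot`, and `min_gain` are hypothetical, the two answer functions stand in for whatever model calls you use, and exact string match is assumed as the correctness check.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

# (task_name, question, expected_answer): a hypothetical labeled dev set.
Example = Tuple[str, str, str]


def accuracy_by_task(
    examples: Iterable[Example],
    answer_fn: Callable[[str], str],
) -> Dict[str, float]:
    """Return the fraction of exact-match answers for each task."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for task, question, expected in examples:
        total[task] += 1
        if answer_fn(question).strip() == expected.strip():
            correct[task] += 1
    return {task: correct[task] / total[task] for task in total}


def choose_prompting(
    examples: Iterable[Example],
    answer_direct: Callable[[str], str],  # assumed: model call without CoT
    answer_cot: Callable[[str], str],     # assumed: model call with CoT
    min_gain: float = 0.0,
) -> Dict[str, str]:
    """Pick 'cot' for a task only if it beats direct prompting by more than min_gain."""
    examples = list(examples)  # allow two passes over a generator
    direct = accuracy_by_task(examples, answer_direct)
    cot = accuracy_by_task(examples, answer_cot)
    return {
        task: "cot" if cot[task] - direct[task] > min_gain else "direct"
        for task in direct
    }
```

Defaulting to direct prompting on ties keeps CoT off unless it earns its extra cost, and a non-zero `min_gain` guards against enabling it on noise from a small evaluation set.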
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:00:43.840045+00:00— report_created — created