Report #75829
[counterintuitive] chain of thought always improves accuracy
Evaluate CoT on a per-task basis; for simple, highly memorized tasks or tasks where the model has strong priors, use direct prompting to avoid 'overthinking' and rationalizing errors.
Journey Context:
CoT is widely treated as a universal accuracy booster that forces System 2 reasoning. However, for tasks where the model already knows the answer intuitively (System 1), forcing a step-by-step rationale can introduce reasoning errors. The model may 'rationalize' an incorrect path that overrides its correct intuition, a phenomenon known as overthinking. CoT also increases latency and token cost, because the rationale must be generated before the answer. It should be reserved for tasks requiring genuine multi-step logic or arithmetic.
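One way to evaluate CoT per task is to score both prompting styles on a small labeled sample and enable CoT only when it clearly wins. The sketch below is illustrative, not a reference implementation. The `ask` callable (prompt in, completion out), the two prompt templates, the last-line answer extraction, and the `min_gain` threshold are all assumptions to be replaced with whatever fits your model and task.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical prompt templates; adjust the wording for your own model.
DIRECT_TEMPLATE = "{question}\nAnswer with only the final answer."
COT_TEMPLATE = (
    "{question}\nThink step by step, then give the final answer on the last line."
)


def extract_answer(completion: str) -> str:
    """Treat the last non-empty line as the final answer (a simplifying assumption)."""
    lines = [line.strip() for line in completion.splitlines() if line.strip()]
    return lines[-1] if lines else ""


def accuracy(
    ask: Callable[[str], str],
    template: str,
    examples: List[Tuple[str, str]],
) -> float:
    """Fraction of examples whose extracted answer exactly matches the gold label."""
    if not examples:
        raise ValueError("need at least one labeled example")
    correct = sum(
        extract_answer(ask(template.format(question=question))) == gold
        for question, gold in examples
    )
    return correct / len(examples)


def choose_strategy(
    ask: Callable[[str], str],
    examples: List[Tuple[str, str]],
    min_gain: float = 0.02,
) -> Dict[str, object]:
    """Pick CoT only if it beats direct prompting by at least min_gain accuracy."""
    direct = accuracy(ask, DIRECT_TEMPLATE, examples)
    cot = accuracy(ask, COT_TEMPLATE, examples)
    strategy = "cot" if cot - direct >= min_gain else "direct"
    return {"direct": direct, "cot": cot, "strategy": strategy}
```

Requiring a minimum gain rather than any gain reflects the cost side of the trade-off: since CoT adds latency and tokens, a tie or a marginal improvement defaults to direct prompting. Run the comparison separately for each task type, because a result on arithmetic problems says little about a lookup-style task.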
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:52:38.259573+00:00— report_created — created