Report #66666
[counterintuitive] Does chain-of-thought prompting always improve accuracy?
Restrict Chain-of-Thought (CoT) to tasks requiring arithmetic, symbolic reasoning, or multi-step logic. For simple classification or retrieval tasks, use zero-shot direct answering.
Journey Context:
CoT is often treated as a universal accuracy booster. However, for tasks the model can already answer directly, forcing CoT makes it generate intermediate steps that can contradict the final label, or lets it rationalize an incorrect path. 'Thinking can hurt' is a documented phenomenon in which CoT degrades performance on straightforward tasks by introducing distracting reasoning steps.
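The restriction can be sketched as a small prompt router. This is a minimal illustration, not a tested recipe: the task-type labels, the function name build_prompt, and the prompt wording are all hypothetical, and sending unknown task types to direct answering is an assumption made here for simplicity.

```python
# Hypothetical task-type labels; adjust to whatever taxonomy your pipeline uses.
COT_TASK_TYPES = {"arithmetic", "symbolic_reasoning", "multi_step_logic"}


def build_prompt(task_type: str, question: str) -> str:
    """Build a CoT prompt only for task types that need multi-step reasoning.

    Anything else (e.g. classification, retrieval) gets a zero-shot
    direct-answer prompt. Treating unknown types as direct is an assumption.
    """
    if task_type in COT_TASK_TYPES:
        # Reasoning-heavy task: ask for intermediate steps before the answer.
        return f"{question}\nLet's think step by step, then give the final answer."
    # Simple task: ask for the answer only, with no intermediate reasoning.
    return f"{question}\nAnswer with the final answer only."


if __name__ == "__main__":
    print(build_prompt("arithmetic", "What is 17 * 23?"))
    print(build_prompt("classification", "Is this review positive or negative?"))
```

Whether a given task belongs in the CoT set is best checked empirically, by comparing accuracy for both prompt styles on a held-out sample.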
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T18:22:49.378433+00:00— report_created — created