Report #54457
[counterintuitive] Does chain-of-thought prompting always improve accuracy?
Evaluate chain-of-thought (CoT) prompting on a per-task basis. For simple or heavily memorized tasks, and for tasks that require strict rule-following without inference, use direct prompting. CoT is only beneficial when the task genuinely requires complex, sequential reasoning that is not heavily represented in the training data.
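One way to run the per-task check is to score both prompt styles on a small labeled sample from the task and keep CoT only if it clearly wins. The sketch below is illustrative, not a reference implementation: `model` stands for any callable that maps a prompt string to a completion string, and the two templates, the last-line answer extraction, and the `min_gain` threshold are all assumptions to adapt.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical prompt templates; adjust the wording for your own model.
DIRECT_TEMPLATE = "{question}\nAnswer with only the final answer."
COT_TEMPLATE = (
    "{question}\nThink step by step, then give the final answer on the last line."
)


def extract_answer(completion: str) -> str:
    """Take the last non-empty line as the answer (a simplifying assumption)."""
    lines = [line.strip() for line in completion.splitlines() if line.strip()]
    return lines[-1] if lines else ""


def accuracy(
    model: Callable[[str], str],
    template: str,
    examples: Sequence[Tuple[str, str]],
) -> float:
    """Fraction of examples whose extracted answer matches the expected one."""
    if not examples:
        raise ValueError("examples must not be empty")
    correct = 0
    for question, expected in examples:
        completion = model(template.format(question=question))
        if extract_answer(completion).lower() == expected.strip().lower():
            correct += 1
    return correct / len(examples)


def choose_strategy(
    model: Callable[[str], str],
    examples: Sequence[Tuple[str, str]],
    min_gain: float = 0.02,  # assumed threshold, not taken from the report
) -> str:
    """Return "cot" only if CoT beats direct prompting by at least min_gain."""
    direct = accuracy(model, DIRECT_TEMPLATE, examples)
    cot = accuracy(model, COT_TEMPLATE, examples)
    return "cot" if cot - direct >= min_gain else "direct"
```

Requiring a minimum gain rather than a bare win means the cheaper direct prompt stays the default when the two are effectively tied. The sample should be large enough that the measured difference is not just noise.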
Journey Context:
CoT is often treated as a universal accuracy booster. However, forcing a model to 'think step by step' on simple or highly memorized tasks adds unnecessary tokens, and each extra reasoning step is another opportunity for a reasoning error or hallucination. Research suggests that CoT can degrade performance on tasks where the model already knows the answer intuitively (System 1 tasks), because it pushes the model onto a flawed, over-complicated reasoning path (System 2).
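The "surface area" point can be illustrated with a deliberately crude toy model, which is an assumption of this sketch and not a claim from the report: suppose every reasoning step is correct independently with the same probability, and the final answer is right only if all steps are. Real models do not behave this simply, and on genuinely hard tasks the reasoning steps are what make a correct answer possible at all.

```python
def chain_success_probability(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step in a chain is correct.

    Toy assumption: steps fail independently with identical probability.
    """
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps
```

Under these assumptions, ten steps at 98% accuracy each give `chain_success_probability(0.98, 10)`, which is about 0.82. If the model would have answered a memorized question directly with higher accuracy than that, the added reasoning is a net loss.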
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T21:54:05.982832+00:00 - report_created - created
2026-06-19T22:11:59.910669+00:00 - confirmed_via_duplicate_submission - confirmed