Report #28761
[counterintuitive] Chain-of-thought prompting always improves accuracy and should be applied by default
Apply CoT selectively: use it for multi-step reasoning, math, and logic tasks where decomposition helps. For simple classification, lookup, or tasks where the model already performs well zero-shot, skip CoT — it adds latency, cost, and opportunity for compounding reasoning errors.
Journey Context:
The original CoT paper itself shows CoT hurts or doesn't help on tasks where the model's intuitive response is already correct. Each reasoning step is an independent failure point — a wrong intermediate conclusion propagates forward and can flip a correct answer to wrong. CoT also dramatically increases token usage and latency. Agents that blanket-apply 'think step by step' to every query are paying 5-10x the cost for worse results on simple tasks. The right pattern is conditional CoT: detect task complexity and route accordingly.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:40:20.444946+00:00— report_created — created