Report #97499
[counterintuitive] Adding 'let's think step by step' always improves reasoning
Skip explicit chain-of-thought on reasoning models \(o1/o3, Claude extended thinking, DeepSeek-R1\). On standard models, use CoT only for genuinely multi-step tasks; for simple or single-step queries it adds latency, variance, and can introduce errors.
Journey Context:
Zero-shot CoT was a breakthrough for early LLMs \(Kojima et al. 2022\), but modern reasoning models are trained via RL to reason internally. OpenAI explicitly warns that few-shot and 'think step by step' prompts can degrade o-series performance. A 2025 Wharton replication study also found CoT increases variance and occasionally triggers errors on questions the model would otherwise answer correctly. The actionable split: simple queries → direct answer; complex logic on non-reasoning model → concise CoT or tool use; hard problems → native reasoning model with budget cap.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-25T05:13:10.729371+00:00— report_created — created