Report #53683
[counterintuitive] Adding chain-of-thought prompting universally improves task accuracy
Restrict chain-of-thought to tasks requiring arithmetic, symbolic, or complex reasoning. For simple retrieval or classification tasks, use direct prompting to avoid degrading performance.
Journey Context:
CoT became a default best practice because it dramatically improves performance on math and logic benchmarks. However, CoT forces the model to generate intermediate steps, which can hurt on tasks where the model can already produce the answer directly from what it learned in pre-training. On such simple tasks, forced step-by-step reasoning invites overthinking, increases latency, and gives the model more opportunities to hallucinate or be distracted by its own generated reasoning, which can reduce accuracy.
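One way to apply this is to route prompts by task type instead of appending a CoT instruction everywhere. The sketch below is illustrative only: `TaskType`, `COT_TASKS`, `build_prompt`, and both suffix strings are hypothetical names and wording, not part of any library, and how a request gets labeled with a task type is assumed to be handled elsewhere.

```python
from enum import Enum


class TaskType(Enum):
    # Hypothetical task labels, mirroring the categories named in this report.
    ARITHMETIC = "arithmetic"
    SYMBOLIC = "symbolic"
    COMPLEX_REASONING = "complex_reasoning"
    RETRIEVAL = "retrieval"
    CLASSIFICATION = "classification"


# Task types for which the report recommends chain-of-thought.
COT_TASKS = {TaskType.ARITHMETIC, TaskType.SYMBOLIC, TaskType.COMPLEX_REASONING}

# Illustrative prompt suffixes; the exact wording is an assumption.
COT_SUFFIX = "Think step by step, then give the final answer on the last line."
DIRECT_SUFFIX = "Answer directly with only the final answer."


def build_prompt(question: str, task_type: TaskType) -> str:
    """Append a CoT instruction only for reasoning-heavy task types."""
    suffix = COT_SUFFIX if task_type in COT_TASKS else DIRECT_SUFFIX
    return f"{question}\n\n{suffix}"


if __name__ == "__main__":
    print(build_prompt("What is 17 * 24?", TaskType.ARITHMETIC))
    print(build_prompt("Is this review positive or negative?", TaskType.CLASSIFICATION))
```

The routing rule is a starting point, not a guarantee. Comparing direct and CoT prompts on a held-out sample of your own tasks is the safer way to decide which categories belong in `COT_TASKS`.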
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:36:06.395764+00:00— report_created — created