Report #94604
[counterintuitive] Does chain-of-thought prompting always improve reasoning accuracy?
Evaluate CoT on a per-task basis. Avoid CoT for tasks requiring strict adherence to format or tasks where the model already has strong, fast intuitions. Use zero-shot CoT only when step-by-step logic genuinely decomposes the problem.
Journey Context:
CoT is widely treated as a universal accuracy booster. However, for tasks where the model has already internalized the pattern (e.g., simple sentiment analysis), forcing CoT adds unnecessary tokens, and each extra token is another chance to derail into a hallucination or logical error before reaching the answer. CoT is beneficial only when the task requires intermediate computation that the model cannot perform in a single forward pass.
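A per-task comparison could be sketched roughly as follows. This is a minimal illustration, not a tested evaluation harness: ModelFn, COT_SUFFIX, accuracy, choose_prompting, and the min_gain threshold are all hypothetical names introduced here, and the model callable is assumed to return only the final answer (any extraction of the answer from the reasoning text is assumed to happen inside it).

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: any callable mapping a prompt to the model's final answer.
# Answer extraction from a CoT transcript is assumed to happen inside this callable.
ModelFn = Callable[[str], str]
Example = Tuple[str, str]  # (question, expected answer)

# A common zero-shot CoT trigger phrase, used here as an illustrative default.
COT_SUFFIX = "\nLet's think step by step."


def accuracy(model: ModelFn, examples: List[Example], suffix: str = "") -> float:
    """Fraction of examples answered exactly right with the given prompt suffix."""
    if not examples:
        raise ValueError("need at least one example per task")
    correct = sum(model(q + suffix).strip() == a for q, a in examples)
    return correct / len(examples)


def choose_prompting(
    model: ModelFn,
    tasks: Dict[str, List[Example]],
    min_gain: float = 0.02,  # assumed threshold; tune for your own eval set
) -> Dict[str, str]:
    """Pick 'cot' for a task only if it beats direct prompting by min_gain."""
    decisions = {}
    for name, examples in tasks.items():
        direct = accuracy(model, examples)
        cot = accuracy(model, examples, COT_SUFFIX)
        decisions[name] = "cot" if cot - direct >= min_gain else "direct"
    return decisions
```

Requiring a minimum gain keeps direct prompting as the default, so CoT has to earn its extra tokens on each task rather than being applied everywhere.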
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:22:26.541442+00:00— report_created — created