Report #88076
[counterintuitive] Why doesn't chain-of-thought prompting solve all reasoning tasks?
Use chain-of-thought \(CoT\) for tasks that decompose naturally into sequential steps \(math, logic puzzles, multi-step inference\). Don't expect CoT to help much with tasks that require parallel constraint satisfaction, global optimization, or backtracking. For those, have the model orchestrate an external solver, a constraint-programming library, or a search algorithm rather than carry out the search internally.
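A minimal sketch of the orchestration pattern, assuming the model is prompted to emit a small JSON problem description instead of reasoning through the search itself. The spec format, the `SPEC` example, and the `solve` helper are all hypothetical and made up for illustration; a real setup would more likely hand the spec to a dedicated constraint solver than to the exhaustive search used here.

```python
import json
from itertools import product

# Hypothetical spec a model might emit; the format is an assumption for this sketch.
SPEC = json.loads("""
{
  "variables": ["a", "b", "c"],
  "domain": [0, 1, 2],
  "constraints": [
    ["a", "!=", "b"],
    ["b", "!=", "c"],
    ["a", "<", "c"]
  ]
}
""")

# Binary operators the illustrative spec format supports.
OPS = {
    "!=": lambda x, y: x != y,
    "==": lambda x, y: x == y,
    "<": lambda x, y: x < y,
    ">": lambda x, y: x > y,
}


def solve(spec):
    """Return the first assignment satisfying every constraint, or None."""
    names = spec["variables"]
    # Exhaustive search over all assignments: fine for tiny problems only.
    for values in product(spec["domain"], repeat=len(names)):
        assignment = dict(zip(names, values))
        if all(
            OPS[op](assignment[left], assignment[right])
            for left, op, right in spec["constraints"]
        ):
            return assignment
    return None


if __name__ == "__main__":
    print(solve(SPEC))
```

With the example spec, `solve(SPEC)` returns `{'a': 0, 'b': 1, 'c': 2}`. The division of labor is the point: the model translates the task into a structured description, and ordinary code checks all constraints together.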
Journey Context:
After Wei et al. showed that CoT dramatically improves performance on reasoning benchmarks, developers started applying it everywhere as a universal reasoning amplifier. But CoT works by making the model produce intermediate reasoning steps sequentially; in effect, it trades extra compute for accuracy on decomposable problems. Tasks that require holding multiple constraints in mind simultaneously and finding a solution that satisfies all of them \(scheduling, graph coloring, or complex logic puzzles with interdependent constraints\) gain little from sequential decomposition, because the constraints interact non-locally: a choice made for one variable can invalidate a choice made many steps earlier. The model also can't reliably 'backtrack' in its chain of thought. Once it commits to a path, every later step conditions on that path. This is why models can solve 'what is 23 \* 47?' with CoT but struggle with 'find an assignment of 8 variables that satisfies these 12 constraints.'
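To make the backtracking point concrete, here is a small sketch of what an external search does and a left-to-right chain of thought does not: it undoes a commitment when a later constraint fails. The `color_graph` function is a hypothetical helper written for this example (plain depth-first backtracking on graph coloring), not a reference to any particular library.

```python
def color_graph(edges, num_nodes, num_colors):
    """Assign a color to nodes 0..num_nodes-1 so no edge joins equal colors.

    Returns a dict {node: color}, or None if no valid coloring exists.
    """
    neighbors = {n: set() for n in range(num_nodes)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    colors = {}

    def assign(node):
        if node == num_nodes:
            return True  # every node has a color
        for c in range(num_colors):
            # A color is allowed only if no already-colored neighbor uses it.
            if all(colors.get(nb) != c for nb in neighbors[node]):
                colors[node] = c
                if assign(node + 1):
                    return True
                # Backtrack: undo this commitment and try the next color.
                del colors[node]
        return False

    return dict(colors) if assign(0) else None


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(color_graph(triangle, 3, 2))  # no 2-coloring of a triangle exists
    print(color_graph(triangle, 3, 3))
```

The `del colors[node]` line is the step with no counterpart in a chain of thought: the search discards a partial assignment entirely, whereas generated text stays in context and keeps influencing later steps.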
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T06:25:10.939833+00:00— report_created — created