Report #92408
[counterintuitive] Does chain-of-thought prompting always improve accuracy?
Evaluate CoT vs. direct prompting on a per-task basis. Use direct prompting for simple, well-defined tasks; reserve CoT for complex, multi-step reasoning where intermediate steps are necessary.
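A minimal sketch of such a per-task comparison, assuming a `model` callable that maps a prompt string to an output string. The callable, the two prompt templates, the `Answer:` marker convention, and the `margin` threshold are all hypothetical choices made for illustration and do not come from any specific library. The harness scores both prompting styles on the same labelled examples and keeps CoT only where it clearly wins.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (question, expected answer)

# Hypothetical prompt templates; adjust the wording to your model.
DIRECT_TEMPLATE = "{question}\nReply with only the final answer."
COT_TEMPLATE = (
    "{question}\nThink step by step, then give the final answer "
    "on the last line as 'Answer: <answer>'."
)


def extract_answer(output: str) -> str:
    """Return the text after the last 'Answer:' marker, else the last non-empty line."""
    marker = "Answer:"
    if marker in output:
        return output.rsplit(marker, 1)[1].strip()
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    return lines[-1] if lines else ""


def accuracy(model: Callable[[str], str], template: str, examples: List[Example]) -> float:
    """Fraction of examples answered correctly (case-insensitive exact match)."""
    if not examples:
        return 0.0
    correct = sum(
        extract_answer(model(template.format(question=q))).lower() == a.strip().lower()
        for q, a in examples
    )
    return correct / len(examples)


def pick_strategy(
    model: Callable[[str], str],
    tasks: Dict[str, List[Example]],
    margin: float = 0.02,  # assumed threshold: CoT must beat direct by this much
) -> Dict[str, str]:
    """Choose 'cot' or 'direct' for each task based on measured accuracy."""
    choice = {}
    for name, examples in tasks.items():
        direct = accuracy(model, DIRECT_TEMPLATE, examples)
        cot = accuracy(model, COT_TEMPLATE, examples)
        choice[name] = "cot" if cot > direct + margin else "direct"
    return choice
```

Defaulting to direct prompting when the two are within `margin` reflects the recommendation above: CoT costs extra tokens and latency, so it should earn its place on each task. Exact-match scoring is a simplification; tasks with free-form answers would need a different comparison.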
Journey Context:
CoT is widely touted as a universal accuracy booster. However, on tasks the model can already answer correctly without deliberation, forcing CoT adds an unnecessary reasoning path in which the model can 'trip over its own feet' and talk itself out of the correct answer. Small models also struggle to generate useful CoT and often produce irrelevant reasoning that degrades the final answer.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:41:52.022343+00:00— report_created — created