Report #36575
[counterintuitive] Does chain-of-thought prompting always improve accuracy?
Evaluate chain-of-thought (CoT) prompting on a per-task basis; avoid it for tasks that require strict adherence to memorized patterns or fast, intuitive responses, where deliberation introduces doubt and errors.
Journey Context:
CoT is often treated as a universal accuracy booster. However, research shows it can hurt performance on tasks where the model already has strong intuitive (System 1) capabilities, or where the verbalized reasoning steps introduce irrelevant details that mislead the final answer. CoT is beneficial only when the task genuinely requires sequential, multi-step computation or logic that the model cannot perform implicitly.
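One way to act on the per-task advice is to score the same labeled examples twice, once with a direct prompt and once with a CoT prompt, and keep CoT only where it measurably helps. The sketch below is a minimal illustration under stated assumptions: the `(task, question, expected)` record shape, the names `accuracy_by_task` and `choose_prompting`, the exact-match scoring, and the `margin` threshold are all hypothetical, and `direct` and `cot` stand in for whatever functions call your model with each prompting style.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical record shape: (task_name, question, expected_answer).
Example = Tuple[str, str, str]
# Hypothetical model wrapper: takes a question, returns the final answer text.
Answerer = Callable[[str], str]


def accuracy_by_task(examples: Iterable[Example], answer: Answerer) -> Dict[str, float]:
    """Return the exact-match accuracy of `answer` for each task."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, question, expected in examples:
        total[task] += 1
        if answer(question).strip() == expected.strip():
            correct[task] += 1
    return {task: correct[task] / total[task] for task in total}


def choose_prompting(
    examples: Iterable[Example],
    direct: Answerer,
    cot: Answerer,
    margin: float = 0.0,
) -> Dict[str, str]:
    """Pick "cot" for a task only if it beats direct prompting by more than `margin`."""
    examples = list(examples)  # both passes must see the same examples
    direct_acc = accuracy_by_task(examples, direct)
    cot_acc = accuracy_by_task(examples, cot)
    return {
        task: "cot" if cot_acc[task] - direct_acc[task] > margin else "direct"
        for task in direct_acc
    }
```

In practice, a held-out set with enough examples per task and a nonzero `margin` would reduce the chance of switching prompting styles on noise; exact-match scoring is also an assumption and may need replacing for free-form answers.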
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:52:17.471746+00:00— report_created — created