Report #79133
[counterintuitive] Does Chain-of-Thought prompting always improve reasoning accuracy?
Evaluate CoT on a per-task basis rather than enabling it by default. Avoid CoT for tasks that require strict adherence to prior examples or fast, intuitive ("System 1") pattern matching, where verbalized reasoning can introduce errors that override an initially correct answer.
Journey Context:
CoT is often treated as a universal accuracy booster. However, research suggests CoT can decrease performance in two situations: on tasks where the model already has strong intuitive capabilities, and on tasks where the verbalized reasoning steps add distracting noise or replace a correct heuristic with a flawed logical derivation.
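One way to act on this is to measure both prompting styles on a small labeled set for each task before picking one. The following is a minimal sketch, not a verified workaround: `ask_model` is a hypothetical callable standing in for whatever model client is in use, the two prompt templates are illustrative, and the last-line answer extraction is an assumption that only holds if the model puts its final answer on the last line.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

# (task name, question, expected answer); hypothetical shape for illustration.
Example = Tuple[str, str, str]


def direct_prompt(question: str) -> str:
    """Ask for the answer only, with no visible reasoning."""
    return f"{question}\nAnswer with only the final answer."


def cot_prompt(question: str) -> str:
    """Ask for visible reasoning followed by the answer."""
    return (
        f"{question}\n"
        "Think step by step, then give the final answer on the last line."
    )


def extract_answer(completion: str) -> str:
    """Assumption: the final answer is the last non-empty line."""
    lines = [ln.strip() for ln in completion.splitlines() if ln.strip()]
    return lines[-1] if lines else ""


def compare_per_task(
    examples: Iterable[Example],
    ask_model: Callable[[str], str],
) -> Dict[str, Dict[str, object]]:
    """Return direct and CoT accuracy per task, plus a simple recommendation."""
    counts = defaultdict(lambda: {"n": 0, "direct": 0, "cot": 0})
    for task, question, expected in examples:
        tally = counts[task]
        tally["n"] += 1
        for mode, build in (("direct", direct_prompt), ("cot", cot_prompt)):
            answer = extract_answer(ask_model(build(question)))
            # Case-insensitive exact match; real tasks may need a looser check.
            tally[mode] += int(answer.lower() == expected.strip().lower())
    return {
        task: {
            "direct": tally["direct"] / tally["n"],
            "cot": tally["cot"] / tally["n"],
            # Recommend CoT only when it is strictly more accurate.
            "use_cot": tally["cot"] > tally["direct"],
        }
        for task, tally in counts.items()
    }
```

Calling `compare_per_task(examples, ask_model)` yields one entry per task, so a task where CoT scores below direct prompting shows up with `use_cot` set to `False`. With only a handful of examples per task the difference may be noise, so treat the flag as a prompt to look closer, not as a decision.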
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:25:13.368957+00:00— report_created — created