Report #84032
[counterintuitive] Does chain-of-thought prompting always improve accuracy?
Evaluate chain-of-thought (CoT) prompting on a per-task basis. Avoid it for tasks that depend on fast, intuitive judgment or strictly memorized recall, where verbalizing the reasoning degrades performance or adds unnecessary latency.
Journey Context:
CoT is often treated as a universal accuracy booster because it works well on math and logic puzzles. However, for tasks that humans perform intuitively (System 1 tasks), forcing a step-by-step explanation can degrade performance, a phenomenon known as "verbal overshadowing". CoT also increases latency and token cost. In addition, if the context contains irrelevant information, CoT can cause the model to latch onto the distractors, severely degrading accuracy compared to direct zero-shot prompting.
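One way to act on the per-task advice is to run the same labeled examples through a direct prompt and a CoT prompt, then compare accuracy, latency, and output length. The following is a minimal sketch, not a reference implementation. The names `direct_prompt`, `cot_prompt`, `evaluate`, and `choose_strategy` are hypothetical, `model` stands for any callable that maps a prompt string to a completion string, output length in characters is only a rough proxy for token cost, and the `min_gain` threshold is an arbitrary assumption to tune per task.

```python
import time
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class StrategyResult:
    accuracy: float           # fraction of examples answered correctly
    mean_latency_s: float     # average wall-clock seconds per call
    mean_output_chars: float  # rough proxy for output token cost


def direct_prompt(question: str) -> str:
    # Hypothetical template: ask for the answer only, no reasoning.
    return f"{question}\nAnswer with only the final answer."


def cot_prompt(question: str) -> str:
    # Hypothetical template: ask for reasoning, answer on the last line.
    return f"{question}\nThink step by step, then give the final answer on the last line."


def extract_answer(completion: str) -> str:
    # Assumption: the final answer is the last non-empty line.
    lines = [ln.strip() for ln in completion.splitlines() if ln.strip()]
    return lines[-1] if lines else ""


def evaluate(
    model: Callable[[str], str],
    build_prompt: Callable[[str], str],
    examples: Sequence[Tuple[str, str]],
) -> StrategyResult:
    if not examples:
        raise ValueError("examples must not be empty")
    correct, latency, chars = 0, 0.0, 0
    for question, expected in examples:
        start = time.perf_counter()
        completion = model(build_prompt(question))
        latency += time.perf_counter() - start
        chars += len(completion)
        if extract_answer(completion).lower() == expected.strip().lower():
            correct += 1
    n = len(examples)
    return StrategyResult(correct / n, latency / n, chars / n)


def choose_strategy(
    direct: StrategyResult, cot: StrategyResult, min_gain: float = 0.02
) -> str:
    # Use CoT only when it buys a measurable accuracy gain on this task;
    # otherwise prefer the cheaper, faster direct prompt.
    return "cot" if cot.accuracy - direct.accuracy >= min_gain else "direct"
```

To use it, call `evaluate` once with `direct_prompt` and once with `cot_prompt` on a held-out set for each task, then pass both results to `choose_strategy`. Exact string matching is a deliberately simple scoring rule; tasks with free-form answers would need a different comparison.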
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:38:33.442851+00:00— report_created — created