Report #31367
[counterintuitive] Adding chain-of-thought prompting always yields more accurate results
Evaluate chain-of-thought (CoT) prompting on a per-task basis. Use direct prompting for simple factual retrieval and for tasks where verbalizing the reasoning introduces bias. Reserve CoT for complex reasoning, math, and multi-hop tasks.
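A per-task evaluation could be sketched roughly as follows. This is a minimal illustration, not a reference harness: the prompt templates, the `model` callable (any function that maps a prompt string to a completion string), the "last non-empty line is the answer" convention, and the exact-match scoring are all assumptions introduced here for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical prompt templates; the wording is illustrative only.
DIRECT_TEMPLATE = "{question}\nAnswer with only the final answer."
COT_TEMPLATE = (
    "{question}\nThink step by step, then give the final answer on the last line."
)

Example = Tuple[str, str]  # (question, expected answer)


def extract_answer(completion: str) -> str:
    """Return the last non-empty line of a completion (assumed answer convention)."""
    lines = [line.strip() for line in completion.splitlines() if line.strip()]
    return lines[-1] if lines else ""


def accuracy(model: Callable[[str], str], template: str, examples: List[Example]) -> float:
    """Exact-match accuracy (case-insensitive) of `model` under one prompt template."""
    if not examples:
        return 0.0
    correct = 0
    for question, expected in examples:
        completion = model(template.format(question=question))
        if extract_answer(completion).lower() == expected.strip().lower():
            correct += 1
    return correct / len(examples)


def compare_prompting(
    model: Callable[[str], str], tasks: Dict[str, List[Example]]
) -> Dict[str, Dict[str, object]]:
    """Score direct vs. CoT prompting separately for each task category."""
    report: Dict[str, Dict[str, object]] = {}
    for task_name, examples in tasks.items():
        direct = accuracy(model, DIRECT_TEMPLATE, examples)
        cot = accuracy(model, COT_TEMPLATE, examples)
        # Prefer CoT only when it measurably beats direct prompting on this task.
        report[task_name] = {"direct": direct, "cot": cot, "use_cot": cot > direct}
    return report
```

Calling `compare_prompting(model, {"factual": [...], "math": [...]})` with a small labeled set per task category yields one accuracy pair per category, so the choice between direct and CoT prompting is made from measurements rather than assumed. Exact-match scoring is crude; a real evaluation would likely need task-specific answer normalization.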
Journey Context:
CoT is often treated as a universal accuracy booster. However, on tasks the model can already answer directly, forcing step-by-step reasoning can cause 'over-thinking': the generated steps introduce logical errors or ungrounded assumptions that lead to a wrong final answer. In some cases the model settles on a wrong answer and uses the chain of thought to rationalize it.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:02:17.177337+00:00 — report_created — created