Report #74531
[counterintuitive] Chain-of-thought prompting always yields more accurate results
Evaluate CoT on a per-task basis; avoid CoT for simple, memorized tasks or highly constrained classification where it introduces reasoning noise.
Journey Context:
CoT works well for math and multi-step logic, which leads to the assumption that 'think step-by-step' should be added to every prompt. However, for tasks the model can already answer directly (memorized facts, simple lookups) or for tightly constrained classification, forcing CoT can cause 'over-thinking' or rationalization errors, where the model talks itself out of the correct intuitive answer. CoT also increases latency and token cost, because the model generates its reasoning before the answer, so it should be used surgically and only where a per-task evaluation shows a gain.
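A per-task evaluation of this kind could be sketched as follows. This is a minimal illustration, not a verified workaround: `model` is a hypothetical callable that takes a prompt string and returns the completion text, the two prompt suffixes and the 'Answer:' marker are assumptions chosen for the example, and output length in characters is only a rough stand-in for token cost.

```python
from typing import Callable, Dict, Iterable, Tuple

# Assumed prompt suffixes for the two conditions (illustrative wording only).
COT_SUFFIX = "\nThink step by step, then give the final answer on a line starting with 'Answer:'."
DIRECT_SUFFIX = "\nReply with only the final answer."


def extract_answer(completion: str) -> str:
    """Return the text after the last 'Answer:' marker, or the whole completion if absent."""
    marker = "Answer:"
    idx = completion.rfind(marker)
    text = completion[idx + len(marker):] if idx != -1 else completion
    return text.strip().lower()


def compare_prompting(
    model: Callable[[str], str],            # hypothetical: prompt in, completion text out
    examples: Iterable[Tuple[str, str]],    # (question, expected answer) pairs for ONE task
) -> Dict[str, Dict[str, float]]:
    """Run the same labeled examples with and without CoT and report accuracy and output size."""
    examples = list(examples)
    n = len(examples)
    stats: Dict[str, Dict[str, float]] = {}
    for name, suffix in (("direct", DIRECT_SUFFIX), ("cot", COT_SUFFIX)):
        correct = 0
        out_chars = 0
        for question, expected in examples:
            completion = model(question + suffix)
            out_chars += len(completion)  # rough proxy for generated tokens
            if extract_answer(completion) == expected.strip().lower():
                correct += 1
        stats[name] = {
            "accuracy": correct / n if n else 0.0,
            "avg_output_chars": out_chars / n if n else 0.0,
        }
    return stats
```

Run it once per task with a labeled sample from that task, and enable CoT only where the 'cot' accuracy is clearly higher than 'direct' and the extra output is acceptable. Exact string matching is a deliberate simplification; free-form answers would need a more tolerant comparison.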
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:41:51.416988+00:00— report_created — created