Report #52859
[counterintuitive] chain-of-thought always improves accuracy
Evaluate CoT on a per-task basis; avoid CoT for tasks requiring strict adherence to formatting or where the model has strong, fast intuition that CoT might rationalize away.
Journey Context:
Chain-of-thought (CoT) prompting is often treated as a universal accuracy booster. However, on tasks where the model is already highly competent, forcing CoT can introduce "overthinking" errors, lead the model to rationalize a wrong answer, or cause it to break a strict output schema (e.g., JSON) because it gets lost in its own reasoning. CoT buys reasoning depth at the cost of latency and strict formatting.
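One way to evaluate CoT per task is to score the same examples under a direct prompt and a CoT prompt and compare the pass rates. The following is a minimal sketch, not a definitive harness: `compare_cot`, the two prompt suffixes, `json_label_matches`, and `toy_model` are all hypothetical names introduced for illustration, and `toy_model` is a hard-coded stand-in for a real model call that merely mimics the schema-breaking failure described above.

```python
import json
from typing import Callable, Dict, List, Tuple

# Assumption: a "model" is any callable mapping a prompt string to a reply string.
Model = Callable[[str], str]
Check = Callable[[str, str], bool]

# Hypothetical prompt suffixes; adapt the wording to your own prompts.
DIRECT_SUFFIX = "\nReply with the JSON object only."
COT_SUFFIX = "\nThink step by step, then give the JSON object."


def json_label_matches(reply: str, expected: str) -> bool:
    """Pass only if the whole reply is a JSON object whose "label" equals expected."""
    try:
        parsed = json.loads(reply)
    except json.JSONDecodeError:
        return False  # extra prose around the JSON counts as a schema failure
    return isinstance(parsed, dict) and parsed.get("label") == expected


def compare_cot(model: Model, examples: List[Tuple[str, str]], check: Check) -> Dict[str, float]:
    """Return the pass rate on the same examples with and without a CoT instruction."""
    scores: Dict[str, float] = {}
    for name, suffix in (("direct", DIRECT_SUFFIX), ("cot", COT_SUFFIX)):
        passed = sum(check(model(prompt + suffix), expected) for prompt, expected in examples)
        scores[name] = passed / len(examples)
    return scores


def toy_model(prompt: str) -> str:
    """Stand-in for a real model: wraps the JSON in prose when asked to reason."""
    if "step by step" in prompt:
        return 'Let me think this through. The answer is {"label": "spam"}'
    return '{"label": "spam"}'


examples = [("Classify this message: 'win a prize now'", "spam")]
scores = compare_cot(toy_model, examples, json_label_matches)
print(scores)
```

With a real model in place of `toy_model`, a noticeably lower `cot` score on a formatting-sensitive task would suggest dropping CoT for that task, or separating the reasoning from the final structured output.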
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:13:18.087009+00:00— report_created — created