Report #57059
[counterintuitive] Does chain-of-thought always improve accuracy?
Evaluate chain-of-thought (CoT) on a per-task basis; avoid it for tasks that require strict adherence to an output format, or where the model has strong prior biases that CoT may rationalize.
Journey Context:
Chain-of-thought (CoT) is often treated as a universal accuracy booster, but forcing a model to 'think step-by-step' can hurt. On tasks where the model already produces the right answer directly, CoT can introduce reasoning errors. Where the model holds a wrong prior, CoT can turn into 'rationalization': the model constructs a plausible but incorrect logical path to justify that prior. CoT also increases latency and token usage, because the reasoning is generated as extra output tokens, and it can break strict output schemas if the reasoning text is not kept separate from the final answer.
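A minimal sketch of a per-task comparison, assuming the model is reachable through a plain prompt-in, text-out callable. The names ModelFn, DIRECT_SUFFIX, COT_SUFFIX, extract_answer, score, and compare_cot are hypothetical and not part of any library, and the 'Answer: <value>' convention is an assumption made for illustration. Output length in characters is used only as a crude stand-in for token cost.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical interface: prompt text in, raw completion text out.
ModelFn = Callable[[str], str]

# Assumed prompt suffixes; adjust the wording to the task being evaluated.
DIRECT_SUFFIX = "\nReply with only 'Answer: <value>'."
COT_SUFFIX = "\nThink step by step, then end with 'Answer: <value>'."


@dataclass
class Score:
    accuracy: float             # fraction of examples answered correctly
    format_failure_rate: float  # fraction of outputs with no parseable answer
    mean_output_chars: float    # crude proxy for latency and token cost


def extract_answer(output: str) -> Optional[str]:
    """Return the first line after the last 'Answer:' marker, or None."""
    marker = "Answer:"
    idx = output.rfind(marker)
    if idx == -1:
        return None
    tail = output[idx + len(marker):].strip()
    return tail.splitlines()[0].strip() if tail else None


def score(model: ModelFn,
          examples: Sequence[Tuple[str, str]],
          suffix: str) -> Score:
    """Run every (question, expected) pair with one prompting style."""
    if not examples:
        raise ValueError("examples must be non-empty")
    correct = failures = chars = 0
    for question, expected in examples:
        output = model(question + suffix)
        chars += len(output)
        answer = extract_answer(output)
        if answer is None:
            failures += 1  # the output broke the expected format
        elif answer == expected:
            correct += 1
    n = len(examples)
    return Score(correct / n, failures / n, chars / n)


def compare_cot(model: ModelFn,
                examples: Sequence[Tuple[str, str]]) -> Dict[str, Score]:
    """Score the same examples with and without a CoT instruction."""
    return {
        "direct": score(model, examples, DIRECT_SUFFIX),
        "cot": score(model, examples, COT_SUFFIX),
    }
```

Running compare_cot on a held-out set for each task gives a side-by-side view of accuracy, format failures, and output size, so CoT can be enabled only where it measurably helps.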
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T02:15:46.284302+00:00— report_created — created