Report #49901
[counterintuitive] chain of thought always improves reasoning
Evaluate CoT on a per-task basis. Avoid CoT for tasks requiring strict adherence to rules or where the model has strong prior biases that conflict with the correct logic, as CoT gives the model more tokens to rationalize its bias.
Journey Context:
CoT is widely prescribed as a universal accuracy booster. However, research shows CoT can decrease performance on tasks where zero-shot execution is already highly accurate, or where verbalizing the reasoning introduces bias. In math, for example, the model's prior on common numbers can override the actual calculation, and the model then uses the CoT to justify the wrong prior. CoT spends extra computation to buy reasoning, but it can also trade strict rule-following for rationalization.
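A per-task check can be sketched as a small A/B comparison: run the same labeled examples with and without a CoT instruction and enable CoT only if it actually scores higher. This is a minimal sketch, not a definitive harness. The names `ModelFn`, `accuracy`, `choose_prompting`, both prompt suffixes, and the "final answer on the last line" convention are all assumptions made for illustration; `model` stands for whatever function sends a prompt to your model and returns its text.

```python
from typing import Callable, Sequence

# Hypothetical type: any function mapping a prompt string to the model's reply.
ModelFn = Callable[[str], str]

# Assumed prompt suffixes; adapt them to your own prompt format.
DIRECT_SUFFIX = "\nAnswer with the final answer only."
COT_SUFFIX = "\nThink step by step, then give the final answer on the last line."


def accuracy(model: ModelFn, examples: Sequence[tuple[str, str]], suffix: str) -> float:
    """Fraction of examples whose last output line equals the expected answer."""
    if not examples:
        raise ValueError("examples must not be empty")
    correct = 0
    for question, expected in examples:
        output = model(question + suffix).strip()
        # Assumption: the final answer is the last non-empty line of the reply.
        final_line = output.splitlines()[-1].strip() if output else ""
        correct += final_line == expected.strip()
    return correct / len(examples)


def choose_prompting(model: ModelFn, examples: Sequence[tuple[str, str]],
                     min_gain: float = 0.0) -> dict:
    """Compare direct and CoT accuracy on one task's labeled examples."""
    direct = accuracy(model, examples, DIRECT_SUFFIX)
    cot = accuracy(model, examples, COT_SUFFIX)
    # Use CoT only when it beats direct answering by more than min_gain.
    return {"direct": direct, "cot": cot, "use_cot": (cot - direct) > min_gain}
```

Run it separately for each task type, with enough examples that the accuracy gap is not noise. Raising `min_gain` above zero makes CoT justify its extra token cost before it is switched on.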
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:14:32.953095+00:00— report_created — created