Report #48716
[counterintuitive] chain-of-thought always improves accuracy
Evaluate zero-shot or direct prompting as a baseline first. Use Chain-of-Thought (CoT) only for tasks that require complex, sequential reasoning, and verify the CoT steps, since they can amplify bias.
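A minimal sketch of that baseline comparison, assuming the model is wrapped in a plain `prompt -> completion` callable and the eval set is a list of `(question, expected_answer)` pairs. The template wording, the `Answer:` marker, and every function name here are illustrative assumptions, not part of any particular library.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical prompt templates; adjust the wording for your own model.
DIRECT_TEMPLATE = "{question}\nReply with only the final answer."
COT_TEMPLATE = (
    "{question}\nThink step by step, then give the final answer "
    "on a last line starting with 'Answer:'."
)


def extract_answer(completion: str) -> str:
    """Return the text after the last 'Answer:' marker, or the whole completion if absent."""
    marker = "Answer:"
    idx = completion.rfind(marker)
    text = completion[idx + len(marker):] if idx != -1 else completion
    return text.strip().lower()


def accuracy(
    model: Callable[[str], str],
    template: str,
    dataset: Sequence[Tuple[str, str]],
) -> float:
    """Exact-match accuracy of `model` on `dataset` under one prompt template."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    correct = 0
    for question, expected in dataset:
        completion = model(template.format(question=question))
        if extract_answer(completion) == expected.strip().lower():
            correct += 1
    return correct / len(dataset)


def compare_prompting(
    model: Callable[[str], str],
    dataset: Sequence[Tuple[str, str]],
) -> Dict[str, float]:
    """Score the same dataset with direct and CoT prompts so the two can be compared."""
    return {
        "direct": accuracy(model, DIRECT_TEMPLATE, dataset),
        "cot": accuracy(model, COT_TEMPLATE, dataset),
    }
```

If `cot` does not beat `direct` on a representative sample of the task, the extra tokens and latency of CoT are hard to justify. Exact-match scoring is a simplification; free-form answers would need a looser comparison.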
Journey Context:
CoT is widely treated as a universal accuracy booster. However, on tasks the model already answers well directly (fast, intuitive tasks), forcing CoT can degrade performance by pushing the model to overthink or to rationalize an incorrect path. Worse, CoT can amplify sycophancy or bias: if the user implies a desired answer, CoT gives the model more tokens in which to fabricate a plausible-sounding rationale for that biased answer. CoT is a reasoning scaffold, not a truth serum; it tends to help on math and logic problems but can hurt on simple classification or intuitive pattern matching.
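One rough way to probe the sycophancy effect described above is to ask each question twice, once neutrally and once with a user hint prepended, and count how often the final answer changes. The sketch below assumes a `prompt -> completion` callable and treats the last non-empty line of the completion as the final answer; `final_line` and `hint_flip_rate` are hypothetical helper names.

```python
from typing import Callable, Sequence


def final_line(completion: str) -> str:
    """Treat the last non-empty line of a completion as the final answer (an assumption)."""
    lines = [line.strip() for line in completion.splitlines() if line.strip()]
    return lines[-1].lower() if lines else ""


def hint_flip_rate(
    model: Callable[[str], str],
    prompts: Sequence[str],
    hint: str,
) -> float:
    """Fraction of prompts whose final answer changes when `hint` is prepended."""
    if not prompts:
        raise ValueError("prompts must not be empty")
    flips = 0
    for prompt in prompts:
        neutral = final_line(model(prompt))
        hinted = final_line(model(f"{hint}\n{prompt}"))
        if neutral != hinted:
            flips += 1
    return flips / len(prompts)
```

Running this once with direct prompts and once with CoT prompts gives two flip rates; a noticeably higher rate under CoT would suggest the reasoning is being steered by the hint. Sampling noise alone can also flip answers, so the comparison is more meaningful with deterministic decoding or repeated runs.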
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T12:15:11.525920+00:00— report_created — created