Report #65367
[counterintuitive] Does chain-of-thought prompting always improve accuracy?
Evaluate Chain-of-Thought (CoT) on a per-task basis. Avoid CoT for tasks that require strict adherence to rules or exact formatting, and for tasks where the model lacks the underlying knowledge, because CoT can rationalize incorrect paths.
Journey Context:
CoT is often treated as a universal accuracy booster. However, if a model does not know the answer, CoT only produces a plausible-sounding but fabricated reasoning chain (rationalization). For simple tasks and strict-formatting tasks, CoT also introduces variance and can degrade performance, either by leading the model down "garden paths" of incorrect logic or by distracting it from rigid structural requirements.
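One way to act on the per-task advice is to score the same examples with and without a CoT instruction and keep CoT only where it measurably helps. The sketch below is a minimal illustration, not a reference implementation: the prompt templates, the `extract_answer` convention (final line, optional `Answer:` prefix), exact-match scoring, and the `toy_model` stub are all assumptions introduced here. `toy_model` is hard-coded and stands in for whatever real model call you use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (question, expected answer)

# Hypothetical prompt templates; adjust to your own prompting style.
DIRECT_TEMPLATE = "{question}\nAnswer with only the final answer."
COT_TEMPLATE = (
    "{question}\nThink step by step, then give the final answer "
    "on the last line as 'Answer: <value>'."
)


@dataclass
class TaskResult:
    direct: float   # exact-match accuracy without CoT
    cot: float      # exact-match accuracy with CoT
    use_cot: bool   # True only if CoT strictly beats direct prompting


def extract_answer(output: str) -> str:
    """Return the last non-empty line, minus an optional 'Answer:' prefix."""
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    if not lines:
        return ""
    last = lines[-1]
    if last.lower().startswith("answer:"):
        last = last[len("answer:"):].strip()
    return last


def accuracy(model: Callable[[str], str], template: str,
             examples: List[Example]) -> float:
    """Exact-match accuracy of `model` on `examples` under one template."""
    correct = sum(
        extract_answer(model(template.format(question=q))) == expected
        for q, expected in examples
    )
    return correct / len(examples)


def compare_per_task(model: Callable[[str], str],
                     tasks: Dict[str, List[Example]]) -> Dict[str, TaskResult]:
    """Score every task both ways and record which prompting style wins."""
    report = {}
    for name, examples in tasks.items():
        direct = accuracy(model, DIRECT_TEMPLATE, examples)
        cot = accuracy(model, COT_TEMPLATE, examples)
        report[name] = TaskResult(direct, cot, use_cot=cot > direct)
    return report


def toy_model(prompt: str) -> str:
    """Hard-coded stand-in for a real model; behavior is invented for the demo."""
    wants_cot = "step by step" in prompt
    if "17 * 3" in prompt:
        return "17 * 3 = 51\nAnswer: 51" if wants_cot else "41"
    if "uppercase" in prompt:
        return "The word is short.\nAnswer: Ok" if wants_cot else "OK"
    return ""


if __name__ == "__main__":
    demo_tasks = {
        "arithmetic": [("What is 17 * 3?", "51")],
        "formatting": [("Return 'ok' in uppercase.", "OK")],
    }
    for task, result in compare_per_task(toy_model, demo_tasks).items():
        print(task, result)
```

With a real model, each task needs enough examples (and ideally repeated runs) for the difference between the two accuracies to be meaningful; a one-example comparison like the demo only exercises the harness.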
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:12:09.379874+00:00 - report_created - created
2026-06-20T16:23:09.506387+00:00 - confirmed_via_duplicate_submission - confirmed