Report #38392
[counterintuitive] Does chain-of-thought (CoT) prompting always improve LLM accuracy?
Evaluate CoT on a per-task basis. Avoid it for simple, highly memorized tasks and under strict latency constraints; reserve it for complex reasoning where intermediate steps are necessary.
Journey Context:
CoT is widely prescribed as a universal accuracy booster. However, for tasks the model already answers well directly, CoT can introduce 'over-thinking' errors: the extra reasoning steps give the model room to talk itself out of a correct answer. It also increases latency and token usage, because the intermediate steps must be generated before the final answer. Finally, the visible reasoning is not always trustworthy: intermediate steps can be biased, or fabricated post hoc to justify a wrong answer.
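A per-task comparison along these lines can be sketched in Python. This is a minimal illustration, not a reference harness: the names `evaluate`, `should_use_cot`, `stub_model`, the two prompt templates, and the `min_gain` threshold are all hypothetical, and `stub_model` is a toy stand-in that simulates one over-thinking error rather than a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Hypothetical model interface: prompt in, (output text, tokens generated) out.
ModelFn = Callable[[str], Tuple[str, int]]

# Illustrative prompt templates (assumptions, not taken from the report).
DIRECT = "Q: {question}\nAnswer with the final answer only."
COT = "Q: {question}\nThink step by step, then give the final answer on the last line."


@dataclass
class EvalResult:
    accuracy: float     # fraction of exact-match final answers
    mean_tokens: float  # average tokens generated per question


def evaluate(model: ModelFn, template: str,
             dataset: Sequence[Tuple[str, str]]) -> EvalResult:
    """Run every question through one prompt style and score it."""
    correct = 0
    total_tokens = 0
    for question, expected in dataset:
        output, tokens = model(template.format(question=question))
        # Score only the last line so CoT reasoning text is ignored.
        final = output.strip().splitlines()[-1].strip()
        correct += int(final == expected)
        total_tokens += tokens
    n = len(dataset)
    return EvalResult(correct / n, total_tokens / n)


def should_use_cot(direct: EvalResult, cot: EvalResult,
                   min_gain: float = 0.02) -> bool:
    """Adopt CoT only if it beats direct prompting by at least min_gain."""
    return cot.accuracy - direct.accuracy >= min_gain


def stub_model(prompt: str) -> Tuple[str, int]:
    """Toy stand-in for an LLM; replace with a real client call."""
    question = prompt.splitlines()[0][len("Q: "):]
    answers = {"2+2": "4", "capital of France": "Paris"}
    if "step by step" in prompt:
        # Simulate a longer output and one over-thinking error.
        final = "5" if question == "2+2" else answers[question]
        return "Let me reason about this...\n" + final, 40
    return answers[question], 3


DATA = [("2+2", "4"), ("capital of France", "Paris")]
direct = evaluate(stub_model, DIRECT, DATA)
cot = evaluate(stub_model, COT, DATA)
print(direct, cot, should_use_cot(direct, cot))
```

In practice the dataset would be a held-out sample of the actual task, and token counts or wall-clock latency would come from the model provider's response metadata. The exact-match scoring here is deliberately crude; tasks with free-form answers would need a more tolerant comparison.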
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:55:13.092014+00:00— report_created — created