Report #85030
[counterintuitive] Does chain-of-thought prompting always improve accuracy?
Evaluate chain-of-thought (CoT) prompting on a per-task basis. Avoid it for tasks that require strict adherence to an output format and for tasks where the model has already memorized the direct mapping, because CoT can introduce reasoning errors.
Journey Context:
CoT is often treated as a universal accuracy booster. For simple tasks, however, it adds unnecessary tokens (increasing cost and latency) and can decrease accuracy: the model may talk itself out of the correct answer or hallucinate intermediate steps that lead to a wrong conclusion. It also makes output parsing harder, because the final answer has to be extracted from the surrounding reasoning text.
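One way to run a per-task comparison is sketched below. It is a minimal illustration, not a definitive harness: the prompt templates, the `Answer:` marker convention, and the names `extract_answer` and `compare_per_task` are all assumptions made for this example, and `ask` stands in for whatever function sends a prompt to your model and returns its text.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical prompt templates; replace with the wording you actually use.
DIRECT_TEMPLATE = "{question}\nReply with only the final answer."
COT_TEMPLATE = (
    "{question}\nThink step by step, then put the final answer "
    "on the last line in the form 'Answer: <value>'."
)


def extract_answer(output: str) -> str:
    """Return the text after the last 'Answer:' marker.

    Falls back to the last non-empty line when the marker is missing.
    """
    marker = "Answer:"
    if marker in output:
        return output.rsplit(marker, 1)[1].strip()
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return lines[-1] if lines else ""


def compare_per_task(
    examples: Iterable[Tuple[str, str, str]],
    ask: Callable[[str], str],
) -> Dict[str, Dict[str, float]]:
    """Compute exact-match accuracy per task for direct and CoT prompts.

    examples: (task_name, question, expected_answer) triples.
    ask: assumed to send a prompt to the model and return its text output.
    """
    counts = defaultdict(lambda: {"direct": 0, "cot": 0, "n": 0})
    for task, question, expected in examples:
        row = counts[task]
        row["n"] += 1
        for mode, template in (("direct", DIRECT_TEMPLATE), ("cot", COT_TEMPLATE)):
            output = ask(template.format(question=question))
            if extract_answer(output) == expected:
                row[mode] += 1
    return {
        task: {"direct": row["direct"] / row["n"], "cot": row["cot"] / row["n"]}
        for task, row in counts.items()
    }
```

If the `cot` accuracy for a task is not clearly higher than the `direct` accuracy on a held-out set, the extra tokens and parsing step are probably not worth it for that task. Exact string match is the simplest possible scorer; tasks with free-form answers would need a more tolerant comparison.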
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:18:46.523469+00:00 — report_created — created