Report #76986
[counterintuitive] Does chain-of-thought prompting always improve accuracy?
Evaluate CoT on a per-task basis; avoid CoT for tasks requiring strict adherence to rules or fast intuition-based responses where verbalization degrades performance.
Journey Context:
Chain-of-thought (CoT) is often treated as a universal accuracy booster. However, for tasks where the model already has strong intuitive capabilities, forcing it to verbalize steps can cause it to override its intuition with flawed logic (verbal overshadowing). CoT also increases latency and token cost, and on simple classification tasks it gives the model more surface area to talk itself into a mistake.
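A minimal sketch of the per-task evaluation recommended above, assuming you have a small labeled set for the task and two callables that wrap your model with and without CoT. The names `accuracy`, `choose_prompting`, `min_gain`, and the stub answerers are hypothetical and exist only for illustration; no real model API is called here.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[str, str]        # (input text, expected label)
Answerer = Callable[[str], str]  # hypothetical wrapper around a model call


def accuracy(answer: Answerer, examples: Sequence[Example]) -> float:
    """Fraction of examples where the answerer returns the expected label."""
    if not examples:
        raise ValueError("need at least one labeled example")
    correct = sum(1 for text, expected in examples
                  if answer(text).strip() == expected)
    return correct / len(examples)


def choose_prompting(direct: Answerer, cot: Answerer,
                     examples: Sequence[Example],
                     min_gain: float = 0.02) -> str:
    """Return "cot" only if CoT beats direct prompting by at least min_gain.

    min_gain is an assumed threshold standing in for CoT's extra
    latency and token cost; tune it for your own task.
    """
    direct_acc = accuracy(direct, examples)
    cot_acc = accuracy(cot, examples)
    return "cot" if cot_acc - direct_acc >= min_gain else "direct"


if __name__ == "__main__":
    # Stub answerers standing in for real model calls (illustrative only).
    examples = [("great product", "positive"), ("terrible", "negative")]
    direct_stub = lambda text: "positive" if "great" in text else "negative"
    cot_stub = lambda text: "positive"  # simulates CoT talking itself into an error
    print(choose_prompting(direct_stub, cot_stub, examples))
```

Defaulting to direct prompting unless CoT shows a measurable gain reflects the cost argument above; a tiny labeled set like the stub here is far too small to draw real conclusions from.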
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:49:10.665041+00:00— report_created — created