Report #35340
[counterintuitive] Does chain-of-thought (CoT) prompting always improve accuracy?
Evaluate CoT on a case-by-case basis rather than applying it by default. Avoid CoT for tasks that require strict adherence to prior examples, or where the model already has a strong, fast intuitive mapping from input to answer (System 1 tasks). Use direct prompting for simple classification or retrieval, and apply CoT only to complex reasoning (System 2 tasks) where intermediate computation is necessary.
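One way to act on this recommendation is to route each request to a direct or a CoT prompt based on a coarse task label. The sketch below is illustrative only: the `TaskKind` categories, the `COT_TASKS` set, and both instruction strings are assumptions made for this example, not part of any library, and the mapping should be adjusted to whatever your own measurements show.

```python
from enum import Enum


class TaskKind(Enum):
    """Hypothetical task labels, chosen for this example."""
    CLASSIFICATION = "classification"
    RETRIEVAL = "retrieval"
    PATTERN_FOLLOWING = "pattern_following"        # must mirror prior examples
    MULTI_STEP_REASONING = "multi_step_reasoning"


# Assumption: only tasks needing intermediate computation get CoT.
COT_TASKS = {TaskKind.MULTI_STEP_REASONING}

# Illustrative instruction strings; tune the wording for your model.
COT_SUFFIX = "Think step by step, then put the final answer on the last line."
DIRECT_SUFFIX = "Answer with the final answer only."


def build_prompt(question: str, kind: TaskKind) -> str:
    """Append a CoT instruction only when the task kind calls for it."""
    suffix = COT_SUFFIX if kind in COT_TASKS else DIRECT_SUFFIX
    return f"{question}\n\n{suffix}"
```

For example, `build_prompt("Is this review positive or negative? 'Loved it.'", TaskKind.CLASSIFICATION)` yields a direct prompt, while the same call with `TaskKind.MULTI_STEP_REASONING` appends the step-by-step instruction.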
Journey Context:
CoT is often treated as a universal accuracy booster. However, research shows that CoT can degrade performance on tasks where models already perform well intuitively, or where verbalizing the reasoning introduces irrelevant constraints or distracts the model. CoT forces an explicit computational path that can override the model's direct pattern-matching capabilities.
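Because the effect can go either way, it may help to measure it on your own labeled examples before committing to a prompting style. The following is a minimal sketch under stated assumptions: `model` is a hypothetical callable that takes a prompt string and returns the completion text, the two instruction strings are illustrative, and the answer is assumed to be the last non-empty line of the output, compared by case-insensitive exact match.

```python
from typing import Callable, Dict, Sequence, Tuple

Example = Tuple[str, str]  # (question, expected answer)

# Illustrative instruction strings (assumptions, not a standard).
DIRECT_SUFFIX = "Answer with the final answer only."
COT_SUFFIX = "Think step by step, then put the final answer on the last line."


def final_line(text: str) -> str:
    """Return the last non-empty line, taken here as the model's answer."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return lines[-1] if lines else ""


def accuracy(model: Callable[[str], str],
             examples: Sequence[Example],
             suffix: str) -> float:
    """Fraction of examples whose final line matches the expected answer."""
    if not examples:
        return 0.0
    correct = sum(
        final_line(model(f"{q}\n\n{suffix}")).lower() == a.strip().lower()
        for q, a in examples
    )
    return correct / len(examples)


def compare_direct_vs_cot(model: Callable[[str], str],
                          examples: Sequence[Example]) -> Dict[str, float]:
    """Score the same examples under both prompting styles."""
    return {
        "direct": accuracy(model, examples, DIRECT_SUFFIX),
        "cot": accuracy(model, examples, COT_SUFFIX),
    }
```

If the `"direct"` score matches or beats the `"cot"` score on a representative sample, that is a signal to skip CoT for that task. Exact match on the last line is a crude scorer, so substitute whatever grading fits your task.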
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T13:47:00.390313+00:00— report_created — created