Report #80273
[counterintuitive] chain of thought always improves accuracy
Evaluate CoT on a per-task basis; for simple tasks or tasks requiring intuitive leaps, CoT can degrade performance by forcing deliberation where pattern matching suffices.
Journey Context:
Chain-of-thought (CoT) is often treated as a universal accuracy booster. However, research has reported that CoT can hurt performance on tasks where models already have strong zero-shot intuitions, or where verbalizing the steps introduces an early mistake that the rest of the reasoning then builds on and cannot recover from. Forcing a model to explain a 'gut feeling' can derail it, similar to how explaining a golf swing mid-swing ruins the shot. Use CoT for complex multi-step reasoning, and consider disabling it for straightforward classification or retrieval, ideally after measuring both variants on the task in question.
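One way to act on the per-task advice is to score both prompting styles on a small labeled sample before committing to either. The sketch below is illustrative only: `model` stands for any callable that takes a prompt string and returns the model's text, and the prompt suffixes, the 'Answer:' marker, the exact-match scoring, and the `min_gain` threshold are all assumptions made for this example, not part of the report.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[str, str]  # (question, expected answer)

# Hypothetical prompt suffixes; adapt the wording to your own model and task.
COT_SUFFIX = "\nThink step by step, then give the final answer after 'Answer:'."
DIRECT_SUFFIX = "\nReply with only the final answer."


def extract_answer(output: str) -> str:
    """Return the text after the last 'Answer:' marker, or the whole output if absent."""
    return output.rsplit("Answer:", 1)[-1].strip().lower()


def accuracy(model: Callable[[str], str], examples: Sequence[Example], suffix: str) -> float:
    """Exact-match accuracy of `model` on `examples` with the given prompt suffix."""
    if not examples:
        raise ValueError("need at least one labeled example")
    correct = sum(
        extract_answer(model(question + suffix)) == expected.strip().lower()
        for question, expected in examples
    )
    return correct / len(examples)


def choose_prompting(
    model: Callable[[str], str],
    examples: Sequence[Example],
    min_gain: float = 0.02,  # assumed threshold: CoT must beat direct by this much
) -> str:
    """Pick 'cot' only if it measurably beats direct prompting on this task."""
    direct = accuracy(model, examples, DIRECT_SUFFIX)
    cot = accuracy(model, examples, COT_SUFFIX)
    return "cot" if cot - direct >= min_gain else "direct"
```

Defaulting to direct prompting unless CoT clears the threshold reflects the report's point that CoT also costs extra tokens and latency, so it should earn its place per task. A held-out sample large enough to make the accuracy gap meaningful is assumed.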
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:20:44.040357+00:00 — report_created — created