Report #83100
[cost_intel] Using chain-of-thought prompting for tasks that don't require multi-step reasoning
Apply CoT only to tasks with genuine reasoning requirements (math, logic, multi-step analysis, causal reasoning). For extraction, classification, formatting, and lookup tasks, use direct prompting. CoT multiplies output token cost by 3-10x with zero quality gain on non-reasoning tasks.
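One way to apply the rule above is to choose the prompt style from the task type before the request is built. The following is a minimal sketch, not a definitive implementation: the task labels, the two suffix strings, and the `build_prompt` helper are all hypothetical names introduced for illustration.

```python
# ASSUMPTION: task labels are hypothetical; the two groups mirror the
# reasoning / non-reasoning split described in the recommendation.
REASONING_TASKS = {"math", "logic", "multi_step_analysis", "causal_reasoning"}
DIRECT_TASKS = {"extraction", "classification", "formatting", "lookup"}

# ASSUMPTION: illustrative wording only; tune for the model in use.
COT_SUFFIX = "Think through the problem step by step, then give the final answer."
DIRECT_SUFFIX = "Answer with the result only. Do not explain."


def build_prompt(task_type: str, instruction: str) -> str:
    """Append a CoT suffix only when the task type needs reasoning."""
    if task_type in REASONING_TASKS:
        return f"{instruction}\n\n{COT_SUFFIX}"
    if task_type in DIRECT_TASKS:
        return f"{instruction}\n\n{DIRECT_SUFFIX}"
    # Unknown task types are rejected so that nobody pays for CoT by default.
    raise ValueError(f"unknown task type: {task_type!r}")


print(build_prompt("extraction", "Extract the company name from the text."))
```

Rejecting unknown task types is a deliberate choice in this sketch: it forces each new task to be classified explicitly instead of silently inheriting the more expensive prompt.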
Journey Context:
CoT is one of the most over-applied prompting techniques. The cost impact: a direct answer might be 50 tokens, but CoT generates 200-500 tokens of reasoning before the answer. On GPT-4 at $0.06 per 1K output tokens, that is $0.003 for the direct answer versus roughly $0.015-0.03 with CoT, a 5-10x multiplier on output-token cost alone. The quality picture, per the original Wei et al. paper: CoT provides significant improvements only on tasks that require intermediate reasoning steps. For 'extract the company name' or 'classify as A/B/C,' CoT adds cost without adding quality. The diagnostic: if a human can answer the task in one mental step without writing anything down, CoT won't help the model either. The effect compounds: in pipelines making millions of calls, unnecessary CoT adds tens of thousands of dollars per month. Worse, CoT on smaller models can actually decrease quality on simple tasks by introducing reasoning noise: the model 'overthinks' and second-guesses answers it had already pattern-matched correctly.
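The arithmetic behind these figures can be checked with a short calculation. This is a sketch under stated assumptions: the price and token counts come from the text above, while the monthly call volume of 1,000,000 and the function names are hypothetical, chosen only to illustrate "millions of calls".

```python
PRICE_PER_1K_OUTPUT_USD = 0.06      # GPT-4 output price quoted in the text
DIRECT_ANSWER_TOKENS = 50           # direct-answer length from the text
COT_REASONING_TOKENS = (200, 500)   # reasoning-token range from the text
CALLS_PER_MONTH = 1_000_000         # ASSUMPTION: illustrative call volume


def output_cost(tokens: int, price_per_1k: float = PRICE_PER_1K_OUTPUT_USD) -> float:
    """USD cost of generating `tokens` output tokens."""
    return tokens / 1000 * price_per_1k


def monthly_cot_overhead(reasoning_tokens: int, calls: int = CALLS_PER_MONTH) -> float:
    """Extra USD per month spent on reasoning tokens alone."""
    return output_cost(reasoning_tokens) * calls


direct = output_cost(DIRECT_ANSWER_TOKENS)
for reasoning in COT_REASONING_TOKENS:
    # A CoT response contains the reasoning tokens plus the answer itself.
    with_cot = output_cost(DIRECT_ANSWER_TOKENS + reasoning)
    print(
        f"{reasoning} reasoning tokens: ${with_cot:.4f} per request "
        f"({with_cot / direct:.1f}x direct), "
        f"${monthly_cot_overhead(reasoning):,.0f} extra per month"
    )
```

Counting the 50 answer tokens on top of 500 reasoning tokens puts the upper bound slightly above the rounded figure in the text. At the assumed volume, the reasoning tokens alone account for the "tens of thousands of dollars per month" mentioned above.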
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:04:23.981829+00:00— report_created — created