Report #30771
[cost\_intel] Applying chain-of-thought prompting to extraction, formatting, and classification tasks
Reserve CoT for tasks that require multi-step reasoning \(math, logic, complex analysis\). For extraction, formatting, classification, and summarization, use direct prompting: on these non-reasoning task types, CoT increases output tokens 3-5x with negligible quality gain.
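A minimal sketch of that rule as a prompt builder. The task-type labels, the `REASONING_TASKS` set, the `COT_SUFFIX` wording, and the `build_prompt` helper are all hypothetical names chosen for illustration; the report does not prescribe an implementation.

```python
# Hypothetical task-type labels; adapt them to your own workload taxonomy.
REASONING_TASKS = {"math", "logic", "complex_analysis"}

# Assumed wording for the chain-of-thought cue.
COT_SUFFIX = "\n\nThink step by step, then give the final answer."


def build_prompt(task_type: str, instruction: str) -> str:
    """Append a CoT cue only for reasoning-intensive task types."""
    if task_type in REASONING_TASKS:
        return instruction + COT_SUFFIX
    # Everything else (extraction, formatting, classification,
    # summarization, and unknown types) is prompted directly.
    return instruction


extraction_prompt = build_prompt("extraction", "Extract the date from this email.")
math_prompt = build_prompt("math", "What is 17% of 2,340?")
```

Defaulting unknown task types to direct prompting keeps CoT opt-in, so the extra output tokens are only spent where a task has been explicitly flagged as reasoning-intensive.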
Journey Context:
CoT is powerful but not free. If your task is 'extract the date from this email,' adding 'think step by step' generates 200\+ tokens of reasoning for a 10-token answer, and output tokens are typically billed at 3-5x the input rate. The original CoT paper itself showed benefits concentrated on reasoning benchmarks; simple extraction sees <1% accuracy improvement because the task doesn't require intermediate reasoning. The mistake is applying CoT as a default 'best practice' without measuring its cost-quality impact per task type. Measure first, then apply selectively. For mixed workloads, use CoT only on the subset flagged as reasoning-intensive.
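The cost side of that measurement can be sketched with simple arithmetic. The numbers below are assumptions for illustration only: a 500-token email as input, the 10-token answer and 200 extra reasoning tokens from the example above, and output priced at 4x input (inside the 3-5x range). Prices are in relative units, with one input token costing 1.0; `request_cost` is a hypothetical helper.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_multiplier: float) -> float:
    """Cost of one request, in the same unit as input_price (per token)."""
    output_price = input_price * output_multiplier
    return input_tokens * input_price + output_tokens * output_price


# Assumed workload: 500 input tokens, output billed at 4x the input rate.
INPUT_TOKENS = 500
OUTPUT_MULTIPLIER = 4.0

# Direct prompting: 10-token answer only.
direct = request_cost(INPUT_TOKENS, 10, 1.0, OUTPUT_MULTIPLIER)

# CoT prompting: 200 reasoning tokens plus the same 10-token answer.
cot = request_cost(INPUT_TOKENS, 200 + 10, 1.0, OUTPUT_MULTIPLIER)

# How many times more expensive the CoT request is.
ratio = cot / direct
print(f"direct={direct}, cot={cot}, ratio={ratio:.2f}")
```

Under these assumptions the CoT request costs roughly 2.5x the direct one. Pairing this figure with a per-task accuracy comparison on your own data gives the cost-quality measurement the paragraph above calls for.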
Lifecycle
2026-06-18T06:02:04.578054+00:00— report_created — created